Abstract

This paper adds counterfactuals to the framework of knowledge-based programs
of  Fagin,  Halpern,  Moses,  and  Vardi  [1995,  1997].  The  use  of  counterfactuals  is 
illustrated  by  designing  a  protocol  in  which  an  agent  stops  sending  messages  once 
it  knows  that  it  is  safe  to  do  so.  Such  behavior  is  difficult  to  capture  in  the  original 
framework  because  it  involves  reasoning  about  counterfactual  executions,  including 
ones  that  are  not  consistent  with  the  protocol.  Attempts  to  formalize  these  notions 
without  counterfactuals  are  shown  to  lead  to  rather  counterintuitive  behavior. 
1  Introduction


Knowledge-based  programs,  first  introduced  by  Halpern  and  Fagin  [1989]  and  further 
developed  by  Fagin,  Halpern,  Moses,  and  Vardi  [1995,  1997],  are  intended  to  provide  a 
high-level  framework  for  the  design  and  specification  of  protocols.  The  idea  is  that,  in 
knowledge-based  programs,  there  are  explicit  tests  for  knowledge.  Thus,  a  knowledge- 
based  program  might  have  the  form 

if  K(x  =  0)  then  y  :=  y  +  1  else  skip, 

where K(x = 0) should be read as “you know x = 0” and skip is the action of doing
nothing.  We  can  informally  view  this  knowledge-based  program  as  saying  “if  you  know 
that  x  =  0,  then  set  y  to  y  +  1  (otherwise  do  nothing)”. 

Knowledge-based  programs  are  an  attempt  to  capture  the  intuition  that  what  an 
agent  does  depends  on  what  it  knows.  They  have  been  used  successfully  in  papers  such  as 
[Dwork  and  Moses  1990;  Hadzilacos  1987;  Halpern,  Moses,  and  Waarts  2001;  Halpern  and 
Zuck  1992;  Mazer  and  Lochovsky  1990;  Mazer  1990;  Moses  and  Tuttle  1988;  Neiger  and 
Toueg  1993]  both  to  help  in  the  design  of  new  protocols  and  to  clarify  the  understanding 
of  existing  protocols.  However,  as  we  show  here,  there  are  cases  when,  used  naively, 
knowledge-based  programs  exhibit  some  quite  counterintuitive  behavior.  We  then  show 
how  this  can  be  overcome  by  the  use  of  counterfactuals  [Lewis  1973;  Stalnaker  1968].  In 
this  introduction,  we  discuss  these  issues  informally,  leaving  the  formal  details  to  later 
sections  of  the  paper. 

Some  counterintuitive  aspects  of  knowledge-based  programs  can  be  understood  by 
considering the bit-transmission problem, from [Fagin, Halpern, Moses, and Vardi 1995].
In this problem, there are two processes, a sender S and a receiver R, that communicate
over  a  communication  line.  The  sender  starts  with  one  bit  (either  0  or  1)  that  it  wants  to 
communicate  to  the  receiver.  The  communication  line  may  be  faulty  and  lose  messages 
in  either  direction  in  any  given  round.  That  is,  there  is  no  guarantee  that  a  message 
sent  by  either  S  or  R  will  be  received.  Because  of  the  uncertainty  regarding  possible 
message  loss,  S  sends  the  bit  to  R  in  every  round,  until  S  receives  an  ack  message 
from  R  acknowledging  receipt  of  the  bit.  R  starts  sending  the  ack  message  in  the  round 
after  it  receives  the  bit,  and  continues  to  send  it  repeatedly  from  then  on.  The  sender  S 
can be viewed as running the program $BT_S$:

if recack then skip else sendbit,

where recack is a proposition that is true if S has already received an ack message from R
and false otherwise, while sendbit is the action of sending the bit.¹ Note that $BT_S$ is a
standard program: it does not have tests for knowledge.

¹ Running such a program amounts to performing the statement repeatedly forever.

We can capture some of the intuitions behind this program by using knowledge. The sender S
keeps sending the bit until an acknowledgment is received from the receiver R. Thus, another
way to describe the sender’s behavior is to say that S keeps sending the bit until it knows
that the bit was received by R. This behavior can be characterized by the knowledge-based
program $BT'_S$:

if $K_S(\mathit{recbit})$ then skip else sendbit,

where  recbit  is  a  proposition  that  is  true  once  R  has  received  the  bit.  The  advantage  of 
this program over the standard program $BT_S$ is that it abstracts away the mechanism
by  which  S  learns  that  the  bit  was  received  by  R.  For  example,  if  messages  from  S  to  R 
are  guaranteed  to  be  delivered  in  the  same  round  in  which  they  are  sent,  then  S  knows 
that  R  received  the  bit  even  if  S  does  not  receive  an  acknowledgment. 

We  might  hope  to  improve  this  even  further.  Consider  a  system  where  all  messages 
sent  are  guaranteed  to  be  delivered,  but  rather  than  arriving  in  one  round,  they  spend 
exactly five rounds in transit. In such a system, a sender using $BT_S$ will send the bit
10 times, because it will take 10 rounds to get the receiver’s acknowledgment after the
original message is sent. The program $BT'_S$ is somewhat better; using it, S sends the
bit only five times, since after the fifth round, S will know that R got the first message.
Nevertheless,  this  seems  wasteful.  Given  that  messages  are  guaranteed  to  be  delivered,  it 
clearly  suffices  for  the  sender  to  send  the  bit  once.  Intuitively,  the  sender  should  be  able 
to  stop  sending  the  message  as  soon  as  it  knows  that  the  receiver  will  eventually  receive 
a  copy  of  the  message;  the  sender  should  not  have  to  wait  until  the  receiver  actually 
receives  it. 
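
The message counts in this example can be checked with a small simulation. The Python sketch below is only illustrative and is not part of the formal development: the function names are hypothetical, and the timing convention (an agent decides at time k - 1 what to do in round k, and a message sent in round k has arrived by time k + delay - 1) is an assumption chosen to match the counts stated above.

```python
def arrival_time(sent_round, delay):
    # Assumed convention: a message sent in round k has arrived by time k + delay - 1.
    return sent_round + delay - 1


def count_sends(stop_test, delay=5, horizon=100):
    """Return how many times S sends the bit before stop_test first holds.

    stop_test(time, first_send, delay) is evaluated at time k - 1 to decide
    S's action in round k; first_send is the round of S's first message
    (None if S has not sent anything yet).
    """
    first_send = None
    sends = 0
    for rnd in range(1, horizon + 1):
        if stop_test(rnd - 1, first_send, delay):
            break
        if first_send is None:
            first_send = rnd
        sends += 1
    return sends


def recack(time, first_send, delay):
    # Standard program: stop once R's acknowledgment has arrived.
    if first_send is None:
        return False
    ack_sent_round = arrival_time(first_send, delay) + 1  # R acks in the round after receipt
    return time >= arrival_time(ack_sent_round, delay)


def knows_recbit(time, first_send, delay):
    # Knowledge-based program: stop once S can infer that the bit has arrived.
    return first_send is not None and time >= arrival_time(first_send, delay)


def knows_eventually_recbit(time, first_send, delay):
    # Intended behavior: stop as soon as some copy of the bit is on its way.
    return first_send is not None


print(count_sends(recack))                   # 10
print(count_sends(knows_recbit))             # 5
print(count_sends(knows_eventually_recbit))  # 1
```

The third test only encodes the intended behavior; how to express it as a test in a program is exactly the question taken up next.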

It seems that there should be no problem handling this using knowledge-based programs.
Let $\Diamond$ be the standard “eventually” operator from temporal logic [Manna and
Pnueli 1992]; $\Diamond\varphi$ means that $\varphi$ is eventually true, and let $\Box$ be its dual, “always”. Now
the following knowledge-based program $BT^{\Diamond}_S$ for the sender should capture exactly what
is required:

if $K_S(\Diamond\mathit{recbit})$ then skip else sendbit.

Unfortunately, $BT^{\Diamond}_S$ does not capture our intuitions here. To understand why, consider
the sender S. Should it send the bit in the first round? According to $BT^{\Diamond}_S$, the sender
S should send the bit if S does not know that R will eventually receive the bit. But
if S sends the bit, then S knows that R will eventually receive it (since messages are
guaranteed to be delivered in 5 rounds). Thus, S should not send the bit. Similar
arguments show that S should not send the bit at any round. On the other hand, if
S never sends the bit, then R will never receive it and thus S should send the bit! It
follows that according to $BT^{\Diamond}_S$, S should send the bit exactly if it will never send the bit.
Obviously, there is no way S can follow such a program. Put another way, this program
cannot be implemented by a standard program at all. This is certainly not the behavior
we would intuitively have expected of $BT^{\Diamond}_S$.²

² While intuitions may, of course, vary, some evidence of the counterintuitive behavior of this program
is  that  it  was  used  in  a  draft  of  [Fagin,  Halpern,  Moses,  and  Vardi  1995];  it  was  several  months  before 
we  realized  its  problematic  nature. 
One approach to dealing with this problem is to change the semantics of knowledge-
based  programs.  Inherent  in  the  semantics  of  knowledge-based  programs  is  the  fact  that 
an  agent  knows  what  standard  protocol  she  is  following.  Thus,  if  the  sender  is  guaranteed 
to  send  a  message  in  round  two,  then  she  knows  at  time  one  that  the  message  will  be 
sent  in  the  following  round.  Moreover,  if  communication  is  reliable,  she  also  knows  the 
message  will  later  be  received.  If  we  weaken  the  semantics  of  knowledge  sufficiently, 
then  this  problem  disappears.  (See  [Engelhardt,  van  der  Meyden,  and  Moses  1998]  for  an 
approach  to  dealing  with  the  problem  addressed  in  this  paper  along  these  lines.)  However, 
it  is  not  yet  clear  how  to  make  this  change  and  still  maintain  the  attractive  features  of 
knowledge-based  programs  that  we  discussed  earlier. 

In  this  paper  we  consider  another  approach  to  dealing  with  the  problem,  based  on 
counterfactuals. Our claim is that the program $BT^{\Diamond}_S$ does not adequately capture our
intuitions. Rather than saying that S should stop sending if S knows that R will eventually
receive the bit, we should, instead, say that S should stop sending if it knows that
even if S does not send another message, R will eventually receive the bit.

How should we capture this? Let $do(i, a)$ be the formula that is true at a point $(r, m)$
if process $i$ performs $a$ in the next round.³ The most obvious way to capture “(even)
if S does not send a message then R will eventually receive the bit” uses standard
implication, also known as material implication or material conditional in philosophical
logic: $do(S, \mathit{skip}) \Rightarrow \Diamond\mathit{recbit}$. This leads to a program such as $BT^{\Rightarrow}_S$:

if $K_S(do(S, \mathit{skip}) \Rightarrow \Diamond\mathit{recbit})$ then skip else sendbit.

Unfortunately, this program does not solve our problems. It, too, is not implementable
by a standard program. To see why, suppose that there is some point in the execution
of this protocol where S sends a message. At this point S knows it is sending a message,
so S knows that $do(S, \mathit{skip})$ is false. Thus, S knows that $do(S, \mathit{skip}) \Rightarrow \Diamond\mathit{recbit}$ holds.
As a result, $K_S(do(S, \mathit{skip}) \Rightarrow \Diamond\mathit{recbit})$ is true, so that the test in $BT^{\Rightarrow}_S$ succeeds. Thus,
according to $BT^{\Rightarrow}_S$, the sender S should not send a message at this point. On the other
hand, if S never sends a message according to the protocol (under any circumstance),
then S knows that it will never send a message (since, after all, S knows how the protocol
works). But in this case, S knows that the receiver will never receive the bit, so the test
fails. Thus, according to $BT^{\Rightarrow}_S$, the sender S should send the message as its first action,
this time contradicting the assumption that the message is never sent. Nothing that S
can do is consistent with this program.
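
Both non-implementability arguments, for the test on “eventually recbit” and for the test using material implication, can be checked mechanically in a toy model. The Python sketch below makes simplifying assumptions that are not part of the formal framework: S's behavior depends only on the round number, only the first `HORIZON` rounds are examined, delivery is guaranteed, and S knows a fact exactly when it holds in the single run generated by the candidate protocol. All names are hypothetical.

```python
from itertools import product

HORIZON = 4  # number of rounds examined (an assumption that keeps the search finite)


def eventually_recbit(protocol):
    # With guaranteed delivery, the bit is eventually received iff it is ever sent.
    return any(protocol)


def knows_diamond(protocol, rnd):
    # K_S(<>recbit), evaluated in the run generated by `protocol`.
    return eventually_recbit(protocol)


def knows_material(protocol, rnd):
    # K_S(do(S, skip) => <>recbit): holds if S sends in this round
    # (false antecedent) or if the bit is eventually received.
    sends_now = protocol[rnd]
    return sends_now or eventually_recbit(protocol)


def prescribed(test, protocol):
    # "if test then skip else sendbit": S sends exactly when the test fails.
    return tuple(not test(protocol, rnd) for rnd in range(HORIZON))


def implementations(test):
    # A candidate implements the program iff it coincides with what the
    # program prescribes when the test is evaluated in the candidate's own run.
    candidates = product([False, True], repeat=HORIZON)
    return [p for p in candidates if prescribed(test, p) == p]


print(implementations(knows_diamond))   # []
print(implementations(knows_material))  # []
```

Each candidate is a tuple whose entry `k` says whether S sends in round `k + 1`. Neither test admits a candidate that agrees with its own prescription, which mirrors the circularity described above.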

³ We assume that round m takes place between time m - 1 and m. Thus, the next round after (r, m)
is round m + 1, which takes place between (r, m) and (r, m + 1).

The problem here is the use of material implication ($\Rightarrow$). Our intuitions are better
captured by using counterfactual implication, which we denote by $>$. A statement such
as $\varphi > \psi$ is read “if $\varphi$ then $\psi$”, just like $\varphi \Rightarrow \psi$. However, the semantics of $>$ is very
different from that of $\Rightarrow$. The idea, which goes back to Stalnaker [1968] and Lewis [1973],
is that a statement such as $\varphi > \psi$ is true at a world $w$ if in the worlds “closest to” or
“most like” $w$ where $\varphi$ is true, $\psi$ is also true. This attempts to capture the intuition that
the counterfactual statement $\varphi > \psi$ stands for “if $\varphi$ were the case, then $\psi$ would hold”.
For example, suppose that we have a wet match and we make a statement such as “if the
match were dry then it would light”. Using $\Rightarrow$, this statement is trivially true, since the
antecedent is false. However, with $>$, the situation is not so obvious. We must consider
the worlds most like the actual world where the match is in fact dry and decide whether
it would light in those worlds. If we think the match is defective for some reason, then
even if it were dry, it would not light.
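
The contrast between the two conditionals in the match example can be made concrete. The following Python sketch is illustrative only: worlds are modeled as small dictionaries, and the closeness measure (the number of independent features, `dry` and `defective`, on which two worlds differ) is an assumption made for this example, not a definition taken from the literature.

```python
def material(antecedent, consequent, world):
    # Material implication looks only at the world of evaluation.
    return (not antecedent(world)) or consequent(world)


def counterfactual(antecedent, consequent, world, worlds, distance):
    # True iff the consequent holds in every antecedent-world closest to `world`.
    candidates = [w for w in worlds if antecedent(w)]
    if not candidates:
        return True  # vacuously true when no world satisfies the antecedent
    best = min(distance(world, w) for w in candidates)
    return all(consequent(w) for w in candidates if distance(world, w) == best)


def distance(w1, w2):
    # Assumed closeness: count differences in the independent features only.
    return sum(w1[k] != w2[k] for k in ("dry", "defective"))


# A match lights exactly when it is dry and not defective.
worlds = [
    {"dry": d, "defective": f, "lights": d and not f}
    for d in (False, True)
    for f in (False, True)
]

is_dry = lambda w: w["dry"]
lights = lambda w: w["lights"]

wet_good = {"dry": False, "defective": False, "lights": False}
wet_bad = {"dry": False, "defective": True, "lights": False}

print(material(is_dry, lights, wet_good))                          # True
print(material(is_dry, lights, wet_bad))                           # True
print(counterfactual(is_dry, lights, wet_good, worlds, distance))  # True
print(counterfactual(is_dry, lights, wet_bad, worlds, distance))   # False
```

Material implication cannot tell the two wet matches apart, while the counterfactual distinguishes the sound match from the defective one.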

A  central  issue  in  the  application  of  counterfactual  reasoning  to  a  concrete  problem  is 
that  we  need  to  specify  what  the  “closest  worlds”  are.  The  philosophical  literature  does 
not  give  us  any  guidance  on  this  point.  We  present  some  general  approaches  for  doing 
so,  motivated  by  our  interest  in  modeling  counterfactual  reasoning  about  what  would 
happen  if  an  agent  were  to  deviate  from  the  protocol  it  is  following.  We  believe  that  this 
example  can  inform  similar  applications  of  counterfactual  reasoning  in  other  contexts. 

There is a subtle technical point that needs to be addressed in order to use counterfactuals
in knowledge-based programs. Traditionally, we talk about a knowledge-based
program $Pg_{kb}$ being implemented by a protocol $P$. This is the case when the behavior
prescribed by $P$ is in accordance with what $Pg_{kb}$ specifies. To determine whether $P$
implements $Pg_{kb}$, the knowledge tests (tests for the truth of formulas of the form $K_i\varphi$)
in $Pg_{kb}$ are evaluated with respect to the points appearing in the set of runs of $P$. In
this system, all the agents know that the properties of $P$ (e.g., facts like process 1 always
sending an acknowledgment after receiving a message from process 2) hold in all
runs. But this set of runs does not account for what may happen if (counter to fact)
some agents were to deviate from $P$. In counterfactual reasoning, we need to evaluate
formulas with respect to a larger set of runs that allows for such deviations.

We  deal  with  this  problem  by  evaluating  counterfactuals  with  respect  to  a  system 
consisting  of  all  possible  runs  (not  just  the  ones  generated  by  P).  While  working  with 
this  larger  system  enables  us  to  reason  about  counterfactuals,  processes  no  longer  know 
the  properties  of  P  in  this  system,  since  it  includes  many  runs  not  in  P.  In  order  to 
deal  with  this,  we  add  a  notion  of  likelihood  to  the  system  using  what  are  called  ranking 
functions  [Spohn  1988].  Runs  generated  by  P  get  rank  0;  all  other  runs  get  higher  rank. 
(Lower  ranks  imply  greater  likelihood.)  Ranks  let  us  define  a  standard  notion  of  belief. 
Although  a  process  does  not  know  that  the  properties  of  P  hold,  it  believes  that  they 
do. Moreover, when restricted to the set of runs of the original protocol $P$, this notion of
belief satisfies the knowledge axiom $B_i\varphi \Rightarrow \varphi$, and coincides with the notion of knowledge
we  had  in  the  original  system.  Thus,  when  the  original  protocol  is  followed,  our  notion 
of  belief  acts  essentially  like  knowledge. 
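
The role played by ranks can be illustrated with a short Python sketch. It is a simplification under stated assumptions: runs are represented as dictionaries, an agent's uncertainty is represented directly as a list of runs it considers possible, and the names `knows`, `believes`, and `rank` are hypothetical. Knowledge requires a fact to hold in all runs considered possible, whereas belief requires it only in those of minimal rank.

```python
def knows(fact, possible):
    # Knowledge: the fact holds in every run considered possible.
    return all(fact(run) for run in possible)


def believes(fact, possible, rank):
    # Belief: the fact holds in every run of minimal rank among those considered possible.
    best = min(rank(run) for run in possible)
    return all(fact(run) for run in possible if rank(run) == best)


def rank(run):
    # Runs generated by the protocol P get rank 0; all other runs get a higher rank.
    return 0 if run["in_P"] else 1


runs = [
    {"name": "r0", "in_P": True, "acks_after_receipt": True},
    {"name": "r1", "in_P": True, "acks_after_receipt": True},
    {"name": "r2", "in_P": False, "acks_after_receipt": False},  # a deviation from P
]

acks = lambda run: run["acks_after_receipt"]

print(knows(acks, runs))           # False: the deviating run is considered possible
print(believes(acks, runs, rank))  # True: only rank-0 runs matter for belief

runs_of_P = [run for run in runs if run["in_P"]]
print(knows(acks, runs_of_P) == believes(acks, runs_of_P, rank))  # True
```

The last line reflects the observation that, restricted to the runs of the original protocol, belief and knowledge coincide.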

Using the counterfactual operator and this interpretation for belief, we get the program
$BT^{>}_S$:

if $B_S(do(S, \mathit{skip}) > \Diamond\mathit{recbit})$ then skip else sendbit.

We  show  that  using  counterfactuals  in  this  way  has  the  desired  effect  here.  If  message 
delivery is guaranteed, then after the message has been sent once, under what seems
to be the most reasonable interpretation of “the closest world” where the message is
not sent, the sender believes that the bit will eventually be received. In particular, in
contexts where messages are delivered in five rounds, using $BT^{>}_S$, the sender will send one
message.

As we said, one advantage of $BT'_S$ over the standard program $BT_S$ is that it abstracts
away the mechanism by which S learns that the bit was received by R. We can abstract
even further. The reason that S keeps sending the bit to R is that S wants R to know
the value of the bit. Thus, intuitively, S should keep sending the bit until it knows that
R knows its value. Let $K_R(\mathit{bit})$ be an abbreviation for $K_R(\mathit{bit} = 0) \lor K_R(\mathit{bit} = 1)$, so
$K_R(\mathit{bit})$ is true precisely if R knows the value of the bit. The sender’s behavior can be
characterized by the following knowledge-based program, $BT^{K}_S$:

if $K_S K_R(\mathit{bit})$ then skip else sendbit.

Clearly when a message stating the value of the bit reaches the receiver, $K_R(\mathit{bit})$ holds.
But it also holds in other circumstances. If, for example, $K_S K_R(\mathit{bit})$ holds initially,
then there is no need to send anything.

As  above,  it  seems  more  efficient  for  the  sender  to  stop  sending  when  he  knows  that 
the  receiver  will  eventually  know  the  value  of  the  bit.  This  suggests  using  the  following 
program: 

if $K_S(do(S, \mathit{skip}) \Rightarrow \Diamond K_R(\mathit{bit}))$ then skip else sendbit.

However, the same reasoning as in the case of $BT^{\Rightarrow}_S$ shows that this program is not
implementable. And, again, using belief and counterfactuals, we can get a program
$BT^{B}_S$ that does work, and uses fewer messages than $BT^{K}_S$. In fact, the following program
does  the  job: 


if $B_S(do(S, \mathit{skip}) > \Diamond B_R(\mathit{bit}))$ then skip else sendbit,

except that now we have to take $B_R(\mathit{bit})$ to be an abbreviation for
$(\mathit{bit} = 0 \land B_R(\mathit{bit} = 0)) \lor (\mathit{bit} = 1 \land B_R(\mathit{bit} = 1))$. Note that $K_R(\mathit{bit})$, which was defined to be
$K_R(\mathit{bit} = 0) \lor K_R(\mathit{bit} = 1)$, is logically equivalent to
$(\mathit{bit} = 0 \land K_R(\mathit{bit} = 0)) \lor (\mathit{bit} = 1 \land K_R(\mathit{bit} = 1))$,
since $K_R\varphi \Rightarrow \varphi$ is valid for any formula $\varphi$. But, in general, $B_R\varphi \Rightarrow \varphi$ is not valid, so
adding the additional conjuncts in the case of belief makes what turns out to be quite an
important difference. Intuitively, $B_R(\mathit{bit})$ says that R has correct beliefs about the value
of the bit.

The  rest  of  this  paper  is  organized  as  follows:  In  the  next  section,  there  is  an  informal 
review  of  the  semantics  of  knowledge-based  programs.  Section  3  extends  the  knowledge- 
based  framework  by  adding  counterfactuals  and  beliefs.  We  then  formally  analyze  the 
programs $BT^{>}_S$ and $BT^{B}_S$, showing that they have the appropriate properties. We
conclude  in  Section  4. 
2  Giving  semantics  to  knowledge-based  programs


Formal  semantics  for  knowledge-based  programs  are  provided  by  Fagin,  Halpern,  Moses, 
and  Vardi  [1995,  1997].  To  keep  the  discussion  in  this  paper  at  an  informal  level,  we 
simplify  things  somewhat  here,  and  review  what  we  hope  will  be  just  enough  of  the 
details  so  that  the  reader  will  be  able  to  follow  the  main  points.  All  the  definitions  in 
this  section,  except  that  of  de  facto  implementation  at  the  end  of  the  section,  are  taken 
from  [Fagin,  Halpern,  Moses,  and  Vardi  1995]. 

Informally,  we  view  a  multi-agent  system  as  consisting  of  a  number  of  interacting 
agents.  We  assume  that,  at  any  given  point  in  time,  each  agent  in  the  system  is  in  some 
local  state.  A  global  state  is  just  a  tuple  consisting  of  each  agent’s  local  state,  together 
with  the  state  of  the  environment,  where  the  environment’s  state  accounts  for  everything 
that  is  relevant  to  the  system  that  is  not  contained  in  the  state  of  the  processes.  The 
agents’  local  states  typically  change  over  time,  as  a  result  of  actions  that  they  perform. 
A  run  is  a  function  from  time  to  global  states.  Intuitively,  a  run  is  a  complete  description 
of  what  happens  over  time  in  one  possible  execution  of  the  system.  A  point  is  a  pair 
$(r, m)$ consisting of a run $r$ and a time $m$. If $r(m) = (\ell_e, \ell_1, \ldots, \ell_n)$, then we use $r_i(m)$
to denote process $i$’s local state $\ell_i$ at the point $(r, m)$, for $i = 1, \ldots, n$, and $r_e(m)$ to denote
the environment’s state $\ell_e$. For simplicity, time here is taken to range over the natural
numbers rather than the reals (so that time is viewed as discrete, rather than dense or
continuous). Round $m$ in run $r$ occurs between time $m - 1$ and $m$. A system $\mathcal{R}$ is a
set  of  runs;  intuitively,  these  runs  describe  all  the  possible  executions  of  the  system.  For 
example,  in  a  poker  game,  the  runs  could  describe  all  the  possible  deals  and  bidding 
sequences. 
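
These notions translate directly into simple data structures. The Python sketch below is one possible encoding, given purely for illustration; the class and function names (`GlobalState`, `Point`, `counting_run`) are hypothetical and do not come from the formal framework.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class GlobalState:
    env: object                       # the environment's state
    local_states: Tuple[object, ...]  # one local state per agent


# A run is a function from time (a natural number) to global states.
Run = Callable[[int], GlobalState]


@dataclass(frozen=True)
class Point:
    run: Run
    time: int

    def local_state(self, i):
        # Agent i's local state at this point; agents are numbered from 1.
        return self.run(self.time).local_states[i - 1]

    def env_state(self):
        return self.run(self.time).env


def counting_run(m):
    # Example run: agent 1's local state is the current time, agent 2 never changes.
    return GlobalState(env="idle", local_states=(m, "waiting"))


# A system is a set of runs.
system = {counting_run}

p = Point(counting_run, 3)
print(p.local_state(1), p.local_state(2), p.env_state())  # 3 waiting idle
```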

Of  major  interest  in  this  paper  are  the  systems  that  we  can  associate  with  a  program. 
To  do  this,  we  must  first  associate  a  system  with  a  joint  protocol.  A  protocol  is  a  function 
from  local  states  to  nonempty  sets  of  actions.  (We  often  consider  deterministic  protocols, 
in  which  a  local  state  is  mapped  to  a  singleton  set  of  actions.  Such  protocols  can  be  viewed 
as  functions  from  local  states  to  actions.)  A  joint  protocol  is  just  a  set  of  protocols,  one 
for each process/agent.

We would like to be able to generate the system corresponding to a given joint protocol
$P$. To do this, we need to describe the setting, or context, in which $P$ is being
executed. Formally, a context $\gamma$ is a tuple $(P_e, \mathcal{G}_0, \tau, \Psi)$, where $P_e$ is a protocol for the
environment, $\mathcal{G}_0$ is a set of initial global states, $\tau$ is a transition function, and $\Psi$ is a set
of admissible runs. The environment is viewed as running a protocol just like the agents;
its protocol is used to capture features of the setting such as “all messages are delivered
within 5 rounds” or “messages may be lost”. The transition function $\tau$ describes how
the actions performed by the agents and the environment change the global state by
associating with each joint action (a tuple consisting of an action for the environment
and one for each of the agents) a global state transformer, that is, a mapping from global
states to global states. For the simple programs considered in this paper, the transition
function will be almost immediate from the description of the global states. The set $\Psi$ of
admissible runs is useful for capturing various fairness properties of the context. Typically,
when no fairness constraints are imposed, $\Psi$ is the set of all runs. (For a discussion
of the role of the set $\Psi$ of admissible runs see [Fagin, Halpern, Moses, and Vardi 1995].)
Since our focus in this paper is reasoning about actions and when they are performed,
we assume that all contexts are such that the environment’s state at the point $(r, m)$
records the joint action performed in the previous round (that is, between $(r, m - 1)$
and $(r, m)$). (Thus, we are essentially considering what are called recording contexts in
[Fagin, Halpern, Moses, and Vardi 1995].)

A run $r$ is consistent with a protocol $P$ if it could have been generated when running
protocol $P$. Formally, run $r$ is consistent with joint protocol $P$ in context $\gamma$ if $r \in \Psi$ (so $r$
is admissible according to the context $\gamma$), its initial global state $r(0)$ is one of the initial
global states given in $\gamma$, and for all $m$, the transition from global state $r(m)$ to $r(m+1)$
is the result of performing one of the joint actions specified by $P$ and the environment
protocol $P_e$ (given in $\gamma$) in the global state $r(m)$. That is, if $P = (P_1, \ldots, P_n)$, $P_e$ is
the environment’s protocol in context $\gamma$, and $r(m) = (\ell_e, \ell_1, \ldots, \ell_n)$, then there must
be a joint action $(a_e, a_1, \ldots, a_n)$ such that $a_e \in P_e(\ell_e)$, $a_i \in P_i(\ell_i)$ for $i = 1, \ldots, n$, and
$r(m+1) = \tau(a_e, a_1, \ldots, a_n)(r(m))$ (so that $r(m+1)$ is the result of applying the joint
action $(a_e, a_1, \ldots, a_n)$ to $r(m)$). For future reference, we will say that a run $r$ is consistent
with $\gamma$ if $r$ is consistent with some joint protocol $P$ in $\gamma$. A system $\mathcal{R}$ represents a joint
protocol $P$ in a context $\gamma$ if it consists of all runs in $\Psi$ consistent with $P$ in $\gamma$. We use
$\mathbf{R}(P, \gamma)$ to denote the system representing $P$ in context $\gamma$.
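
The definition of consistency suggests a direct way to enumerate finite prefixes of the runs of a joint protocol in a context. The following Python sketch does so under simplifying assumptions that are not part of the formal definition: every run is admissible, only prefixes of a fixed length are generated, and a global state is a tuple whose first component is the environment's state. The toy context at the end (a lossy line with same-round delivery) and all names are hypothetical.

```python
from itertools import product


def consistent_prefixes(protocols, env_protocol, initial_states, tau, horizon):
    """Enumerate prefixes (with `horizon` transitions) of runs consistent with a joint protocol."""
    prefixes = [[g] for g in initial_states]
    for _ in range(horizon):
        extended = []
        for prefix in prefixes:
            env_state, *local_states = prefix[-1]
            # Each protocol maps a (local) state to a nonempty set of actions.
            choices = [sorted(env_protocol(env_state))]
            choices += [sorted(P(l)) for P, l in zip(protocols, local_states)]
            for joint_action in product(*choices):
                # tau maps a joint action to a global state transformer.
                extended.append(prefix + [tau(joint_action)(prefix[-1])])
        prefixes = extended
    return prefixes


def env_protocol(env_state):
    return {"deliver", "lose"}  # the line may lose the message in any round


def sender(local_state):
    return {"sendbit"}


def receiver(local_state):
    return {"skip"}


def tau(joint_action):
    env_act, s_act, r_act = joint_action

    def transform(state):
        old_env, s_state, r_state = state
        if env_act == "deliver" and s_act == "sendbit":
            r_state = "bit"
        # As in a recording context, the environment records the joint action.
        return (joint_action, s_state, r_state)

    return transform


initial = [(None, "s0", "empty")]
prefixes = consistent_prefixes([sender, receiver], env_protocol, initial, tau, horizon=2)
print(len(prefixes))                             # 4
print(sum(p[-1][2] == "bit" for p in prefixes))  # 3
```

With two rounds and two environment actions per round there are four prefixes, and the receiver holds the bit at the end of three of them.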

The basic logical language $\mathcal{L}$ that we use is a standard propositional temporal logic.
We start out with a set $\Phi$ of primitive propositions $p, q, \ldots$ (which are sometimes given
more meaningful names such as recbit or recack). Every primitive proposition is considered
to be a formula of $\mathcal{L}$. We close off under the Boolean operators $\land$ (conjunction) and
$\neg$ (negation). Thus, if $\varphi$ and $\psi$ are formulas of $\mathcal{L}$, then so are $\neg\varphi$ and $\varphi \land \psi$. The other
Boolean operators are definable in terms of these. E.g., implication $\varphi \Rightarrow \psi$ is defined
as $\neg(\varphi \land \neg\psi)$. Finally, we close off under temporal operators. For the purposes of this
paper, it suffices to consider the standard linear-time temporal operators $\bigcirc$ (“in the next
(global) state”) and $\Diamond$ (“eventually”): If $\varphi$ is a formula, then so are $\bigcirc\varphi$ and $\Diamond\varphi$. The
dual of $\Diamond$, which stands for “forever,” is denoted by $\Box$ and defined to be shorthand for
$\neg\Diamond\neg$. This completes the definition of the language.

In order to assign meaning to the formulas of such a language $\mathcal{L}$ in a system $\mathcal{R}$, we
need an interpretation $\pi$, which determines the truth of the primitive propositions at
each of the global states of $\mathcal{R}$. Thus, $\pi : \Phi \times \mathcal{G} \to \{\mathbf{true}, \mathbf{false}\}$, where $\pi(p, g) = \mathbf{true}$
exactly if the proposition $p$ is true at the global state $g$. An interpreted system is a pair
$\mathcal{I} = (\mathcal{R}, \pi)$ where $\mathcal{R}$ is a system as before, and $\pi$ is an interpretation for $\Phi$ in $\mathcal{R}$. Formulas
of $\mathcal{L}$ are considered true or false at a point $(r, m)$ with respect to an interpreted system
$\mathcal{I} = (\mathcal{R}, \pi)$ where $r \in \mathcal{R}$. Formally,

•  $(\mathcal{I}, r, m) \models p$, for $p \in \Phi$, iff $\pi(p, r(m)) = \mathbf{true}$.

•  $(\mathcal{I}, r, m) \models \neg\varphi$ iff $(\mathcal{I}, r, m) \not\models \varphi$.

•  $(\mathcal{I}, r, m) \models \varphi \land \psi$ iff both $(\mathcal{I}, r, m) \models \varphi$ and $(\mathcal{I}, r, m) \models \psi$.

•  $(\mathcal{I}, r, m) \models \bigcirc\varphi$ iff $(\mathcal{I}, r, m + 1) \models \varphi$.

•  $(\mathcal{I}, r, m) \models \Diamond\varphi$ iff $(\mathcal{I}, r, m') \models \varphi$ for some $m' \geq m$.

By adding an interpretation $\pi$ to the context $\gamma$, we obtain an interpreted context $(\gamma, \pi)$.
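
The semantic clauses can be turned into a small recursive evaluator. The Python sketch below is illustrative and rests on assumptions that are not in the definitions: formulas are encoded as nested tuples, “eventually” is read reflexively (the present counts), and a run is given as a finite list of global states whose last state is taken to repeat forever, so that “eventually” can be decided by a finite search. The names `holds`, `always`, and `RECBIT` are hypothetical.

```python
def holds(formula, run, m, pi):
    """Decide whether `formula` is true at the point (run, m) under interpretation pi."""
    last = len(run) - 1
    m = min(m, last)  # beyond the stored prefix the run stays in its last state
    op = formula[0]
    if op == "prop":
        return pi(formula[1], run[m])
    if op == "not":
        return not holds(formula[1], run, m, pi)
    if op == "and":
        return holds(formula[1], run, m, pi) and holds(formula[2], run, m, pi)
    if op == "next":
        return holds(formula[1], run, m + 1, pi)
    if op == "eventually":
        # Times after `last` behave exactly like `last`, so this search is exhaustive.
        return any(holds(formula[1], run, k, pi) for k in range(m, last + 1))
    raise ValueError(f"unknown operator: {op}")


def always(formula):
    # "Always" is the dual of "eventually".
    return ("not", ("eventually", ("not", formula)))


# Global states are dictionaries here; pi looks the proposition up in the state.
pi = lambda p, g: g[p]
run = [{"recbit": False}, {"recbit": False}, {"recbit": True}]
RECBIT = ("prop", "recbit")

print(holds(RECBIT, run, 0, pi))                  # False
print(holds(("eventually", RECBIT), run, 0, pi))  # True
print(holds(("next", RECBIT), run, 1, pi))        # True
print(holds(always(RECBIT), run, 0, pi))          # False
print(holds(always(RECBIT), run, 2, pi))          # True
```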

We  now  describe  a  simple  programming  language,  introduced  in  [Fagin,  Halpern, 
Moses,  and  Vardi  1995],  which  is  still  rich  enough  to  describe  protocols,  and  whose 
syntax  emphasizes  the  fact  that  an  agent  performs  actions  based  on  the  result  of  a  test 
that is applied to her local state. A (standard) program for agent $i$ is a statement of the
form:

case of
    if $t_1$ do $a_1$
    if $t_2$ do $a_2$
    ...
end case

where the $t_j$’s are standard tests for agent $i$ and the $a_j$’s are actions of agent $i$ (i.e.,
$a_j \in ACT_i$). (We later modify these programs to obtain knowledge-based and belief-
based programs; the distinction will come from the kinds of tests allowed. We omit the
case statement if there is only one clause.) A standard test for agent $i$ is simply a
propositional formula over a set $\Phi_i$ of primitive propositions. Intuitively, if $L_i$ represents
the local states of agent $i$ in $\mathcal{G}$, then once we know how to evaluate the tests in the
program at the local states in $L_i$, we can convert this program to a protocol over $L_i$: at
a local state $\ell$, agent $i$ nondeterministically chooses one of the (possibly infinitely many)
clauses in the case statement whose test is true at $\ell$, and executes the corresponding
action.

We want to use an interpretation $\pi$ to tell us how to evaluate the tests. However, not
just any interpretation will do. We intend the tests in a program for agent $i$ to be local,
that is, to depend only on agent $i$’s local state. It would be inappropriate for agent $i$’s
action to depend on the truth value of a test that $i$ could not determine from her local
state. An interpretation $\pi$ on the global states in $\mathcal{G}$ is compatible with a program $Pg_i$
for agent $i$ if every proposition that appears in $Pg_i$ is local to $i$; that is, if $q$ appears
in $Pg_i$, the states $s$ and $s'$ are in $\mathcal{G}$, and $s \sim_i s'$ (that is, agent $i$ has the same local state
in $s$ and $s'$), then $\pi(s)(q) = \pi(s')(q)$. If $\varphi$ is a
propositional formula all of whose primitive propositions are local to agent $i$, and $\ell$ is a
local state of agent $i$, then we write $(\pi, \ell) \models \varphi$ if $\varphi$ is satisfied by the truth assignment
$\pi(s)$, where $s = (s_e, s_1, \ldots, s_n)$ is a global state such that $s_i = \ell$. Because all the primitive
propositions in $\varphi$ are local to $i$, it does not matter which global state $s$ we choose, as
long as $i$’s local state in $s$ is $\ell$. Given a program $Pg_i$ for agent $i$ and an interpretation $\pi$
compatible with $Pg_i$, we define a protocol that we denote $Pg_i^{\pi}$ by setting:


$$Pg_i^{\pi}(\ell) = \begin{cases} \{a_j : (\pi, \ell) \models t_j\} & \text{if } \{j : (\pi, \ell) \models t_j\} \neq \emptyset \\ \{\mathit{skip}\} & \text{if } \{j : (\pi, \ell) \models t_j\} = \emptyset. \end{cases}$$


Intuitively, $Pg_i^{\pi}$ selects all actions from the clauses that satisfy the test, and selects the
null  action  skip  if  no  test  is  satisfied.  In  general,  we  get  a  nondeterministic  protocol,  since 
more  than  one  test  may  be  satisfied  at  a  given  state. 
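
The passage from a program to a protocol can be sketched in a few lines of Python. This is an illustration under assumptions, not the formal construction: a program is a list of (test, action) pairs, tests are opaque names, and the interpretation is a function `pi(test, local_state)` returning a truth value. The names `protocol_from_program`, `bt_s`, and `no_recack` are hypothetical.

```python
def protocol_from_program(clauses, pi):
    """Turn a list of (test, action) clauses into a protocol on local states."""
    def protocol(local_state):
        # Collect the actions of all clauses whose test holds at this local state.
        enabled = {action for test, action in clauses if pi(test, local_state)}
        # If no test holds, the null action skip is selected.
        return enabled if enabled else {"skip"}
    return protocol


def pi(test, local_state):
    # Assumed interpretation: tests refer only to the agent's local state.
    truth = {
        "recack": local_state["ack"],
        "no_recack": not local_state["ack"],
    }
    return truth[test]


# "if recack then skip else sendbit", written as two clauses.
bt_s = [("recack", "skip"), ("no_recack", "sendbit")]
P_S = protocol_from_program(bt_s, pi)

print(P_S({"ack": False}))  # {'sendbit'}
print(P_S({"ack": True}))   # {'skip'}

# A program with a single clause falls back to skip when its test fails.
one_clause = protocol_from_program([("recack", "sendbit")], pi)
print(one_clause({"ack": False}))  # {'skip'}
```

Returning a set of actions rather than a single action keeps the nondeterminism of the definition: when several tests hold at once, all of the corresponding actions are permitted.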

Many of the definitions that we gave for protocols have natural analogues for programs.
We define a joint program to be a tuple $Pg = (Pg_1, \ldots, Pg_n)$, where $Pg_i$ is a
program for agent $i$. An interpretation $\pi$ is compatible with $Pg$ if $\pi$ is compatible with
each of the $Pg_i$’s. From $Pg$ and $\pi$ we get a joint protocol $Pg^{\pi} = (Pg_1^{\pi}, \ldots, Pg_n^{\pi})$. We say
that an interpreted system $\mathcal{I} = (\mathcal{R}, \pi)$ represents a joint program $Pg$ in the interpreted
context $(\gamma, \pi)$ exactly if $\pi$ is compatible with $Pg$ and $\mathcal{I}$ represents the corresponding
protocol $Pg^{\pi}$. We denote the interpreted system representing $Pg$ in $(\gamma, \pi)$ by $\mathbf{I}(Pg, \gamma, \pi)$.
Of course, this definition only makes sense if $\pi$ is compatible with $Pg$. From now on we
always assume that this is the case.

The  syntactic  form  of  our  standard  programs  is  in  many  ways  more  restricted  than 
that  of  programs  in  common  programming  languages  such  as  C  or  FORTRAN.  In  such 
languages, one typically sees constructs such as for, while, or if ... then ... else ...,
which  do  not  have  syntactic  analogues  in  our  formalism.  As  discussed  in  [Fagin,  Halpern, 
Moses,  and  Vardi  1995],  it  is  possible  to  encode  a  program  counter  in  tests  and  actions 
of  standard  programs.  By  doing  so,  it  is  possible  to  simulate  these  constructs.  Hence, 
there  is  essentially  no  loss  of  generality  in  our  definition  of  standard  programs. 

Since each test in a standard program $\mathsf{Pg}$ run by process $i$ can be evaluated in each
local state, we can derive a protocol from $\mathsf{Pg}$ in an obvious way: to find out what process $i$
does in a local state $\ell$, we evaluate the tests in the program in $\ell$ and perform the
appropriate action. A run is consistent with $\mathsf{Pg}$ in interpreted context $(\gamma, \pi)$ if it is
consistent with the protocol derived from $\mathsf{Pg}$. Similarly, a system represents $\mathsf{Pg}$ in interpreted
context $(\gamma, \pi)$ if it represents the protocol derived from $\mathsf{Pg}$ in $(\gamma, \pi)$.

Example 2.1 Consider the (joint) program $\mathsf{BT} = (\mathsf{BT}_S, \mathsf{BT}_R)$, where $\mathsf{BT}_S$ is as defined
in the introduction, and $\mathsf{BT}_R$ is the program

if recbit then sendack else skip.

Thus, in $\mathsf{BT}_R$, the receiver sends an acknowledgment if it has received the bit, and
otherwise does nothing. This program, like all the programs considered in this paper,
is applied repeatedly, so it effectively runs forever. Assume that $S$'s local state includes
the time, its input bit, and whether or not $S$ has received an acknowledgment
from $R$; the state thus has the form $(m, i, x)$, where $m$ is a natural number (the time),
$i \in \{0, 1\}$ is the input bit, and $x \in \{\lambda, \mathit{ack}\}$. Similarly, $R$'s local state has the form
$(m, x)$, where $m$ is the time and $x$ is either $\lambda$, 0, or 1, depending on whether or not it has
received the bit from $S$ and what the bit is. As in all recording contexts, the environment
state keeps track of the actions performed by the agents. Since the environment
state plays no role here, we omit it from the description of the global state, and just
identify the global state with the pair consisting of $S$'s and $R$'s local states. Suppose that,
in context $\gamma$, the environment protocol nondeterministically decides whether or not a
message sent by $S$ and/or $R$ is delivered, the initial global states are $((0, 0, \lambda), (0, \lambda))$ and
$((0, 1, \lambda), (0, \lambda))$, the transition function is such that the joint actions have the obvious
effect on the global state, and all runs are admissible. Then a run consistent with $\mathsf{BT}$
in $(\gamma, \pi)$ in which $S$'s bit is 0, $R$ receives the bit in the second round, and $S$ receives
an acknowledgment from $R$ in the third round has the following sequence of global states:
$((0, 0, \lambda), (0, \lambda)), ((1, 0, \lambda), (1, \lambda)), ((2, 0, \lambda), (2, 0)), ((3, 0, \mathit{ack}), (3, 0)), ((4, 0, \mathit{ack}), (4, 0)), \ldots$.
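The run described in Example 2.1 can be reproduced with a small simulation. The sketch below rests on several assumptions of ours: the sender program, which is defined in the introduction and not repeated here, is taken to be "if not recack then sendbit else skip"; the environment's nondeterministic choices are supplied explicitly as a list of per-round delivery decisions; and `None` stands for the empty value $\lambda$. All function names are hypothetical.

```python
LAMBDA = None  # stands for the empty value (lambda in the text)


def bt_sender(state):
    """Assumed sender program: send the bit until an ack has been received."""
    _time, _bit, x = state
    return "sendbit" if x != "ack" else "skip"


def bt_receiver(state):
    """Receiver program from Example 2.1: acknowledge once the bit is known."""
    _time, x = state
    return "sendack" if x is not LAMBDA else "skip"


def step(global_state, deliver_to_r, deliver_to_s):
    """One round: both agents act, then the environment delivers or drops."""
    s_state, r_state = global_state
    a_s, a_r = bt_sender(s_state), bt_receiver(r_state)
    m, bit, x = s_state
    _, y = r_state
    if a_s == "sendbit" and deliver_to_r:
        y = bit
    if a_r == "sendack" and deliver_to_s:
        x = "ack"
    return ((m + 1, bit, x), (m + 1, y))


def run(bit, deliveries):
    """Global states at times 0..len(deliveries) for the given input bit."""
    g = ((0, bit, LAMBDA), (0, LAMBDA))
    states = [g]
    for deliver_to_r, deliver_to_s in deliveries:
        g = step(g, deliver_to_r, deliver_to_s)
        states.append(g)
    return states


# Bit delivered in round 2, acknowledgment delivered in round 3.
trace = run(0, [(False, False), (True, False), (False, True), (False, False)])
```

With these delivery choices, `trace` lists the first five global states of the run given in the example.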


Now we consider knowledge-based programs. We start by extending our logical language
by adding a modal operator $K_i$ for every agent $i = 1, \ldots, n$. Thus, whenever $\varphi$
is a formula, so is $K_i\varphi$. Let $\mathcal{L}_K$ be the resulting language. According to the standard
definition of knowledge in systems [Fagin, Halpern, Moses, and Vardi 1995], an agent $i$
knows a fact $\varphi$ at a given point $(r, m)$ in interpreted system $\mathcal{I} = (\mathcal{R}, \pi)$ if $\varphi$ is true at all
points in $\mathcal{R}$ where $i$ has the same local state as it does at $(r, m)$. We now have

• $(\mathcal{I}, r, m) \models K_i\varphi$ if $(\mathcal{I}, r', m') \models \varphi$ for all points $(r', m')$ such that $r_i(m) = r'_i(m')$.

Thus, $i$ knows $\varphi$ at the point $(r, m)$ if $\varphi$ holds at all points consistent with $i$'s information
at $(r, m)$.
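For a finite set of points, this definition of knowledge amounts to a universally quantified check over the points that the agent cannot distinguish from the current one. The following sketch makes that explicit; the representation of points as dictionaries and the names `knows`, `local_state`, and `received` are illustrative assumptions, not notation from the paper.

```python
def knows(agent, fact, point, points, local_state):
    """True if `fact` holds at every point where `agent` has the same local state."""
    here = local_state(agent, point)
    return all(fact(q) for q in points if local_state(agent, q) == here)


# Hypothetical system with three points; each point records both local states.
points = [
    {"S": "sent", "R": "got"},
    {"S": "sent", "R": "nothing"},
    {"S": "acked", "R": "got"},
]


def local_state(agent, point):
    return point[agent]


def received(point):
    return point["R"] == "got"
```

At the first point $S$ does not know `received`, because it cannot rule out the second point; at the third point it does, since its local state `"acked"` occurs nowhere else.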

A knowledge-based program has the same structure as a standard program except
that all tests in the program text $\mathsf{Pg}_i$ for agent $i$ are formulas of the form $K_i\psi$.⁴ As
for standard programs, we can define when a protocol implements a knowledge-based
program,  except  this  time  it  is  with  respect  to  an  interpreted  context.  The  situation 
in  this  case  is,  however,  somewhat  more  complicated.  In  a  given  context,  a  process 
can  determine  the  truth  of  a  standard  test  such  as  “x  =  0”  by  simply  checking  its 
local  state.  However,  the  truth  of  the  tests  for  knowledge  that  appear  in  knowledge- 
based  programs  cannot  in  general  be  determined  simply  by  looking  at  the  local  state 
in  isolation.  We  need  to  look  at  the  whole  system.  As  a  consequence,  given  a  run,  we 
cannot  in  general  determine  if  it  is  consistent  with  a  knowledge-based  program  in  a  given 
interpreted  context.  This  is  because  we  cannot  tell  how  the  tests  for  knowledge  turn  out 
without  being  given  the  other  possible  runs  of  the  system;  what  a  process  knows  at  one 
point  will  depend  in  general  on  what  other  points  are  possible.  This  stands  in  sharp 
contrast  to  the  situation  for  standard  programs. 

This means it no longer makes sense to talk about a run being consistent with a
knowledge-based program in a given context. However, notice that, given an interpreted
system $\mathcal{I} = (\mathcal{R}, \pi)$, we can derive a protocol from a knowledge-based program $\mathsf{Pg}_{kb}$
for process $i$ by evaluating the knowledge tests in $\mathsf{Pg}_{kb}$ with respect to $\mathcal{I}$. That is, a
test such as $K_i\varphi$ holds in a local state $\ell$ if $\varphi$ holds at all points $(r, m)$ in $\mathcal{I}$ such that
$r_i(m) = \ell$.⁵ In general, different protocols can be derived from a given knowledge-based
program, depending on what system we use to evaluate the tests. Let $\mathsf{Pg}_{kb}^{\mathcal{I}}$ denote the
protocol derived from $\mathsf{Pg}_{kb}$ by using $\mathcal{I}$ to evaluate the tests for knowledge. An interpreted
system $\mathcal{I}$ represents the knowledge-based program $\mathsf{Pg}_{kb}$ in interpreted context $(\gamma, \pi)$ if
$\mathcal{I}$ represents the protocol $\mathsf{Pg}_{kb}^{\mathcal{I}}$. That is, $\mathcal{I}$ represents $\mathsf{Pg}_{kb}$ if $\mathcal{I} = \mathbf{I}(\mathsf{Pg}_{kb}^{\mathcal{I}}, \gamma, \pi)$. Thus,
a system represents $\mathsf{Pg}_{kb}$ if it satisfies a certain fixed-point equation. A protocol $P$
implements $\mathsf{Pg}_{kb}$ in interpreted context $(\gamma, \pi)$ if $P = \mathsf{Pg}_{kb}^{\mathbf{I}(P, \gamma, \pi)}$.

⁴ All standard programs can be viewed as knowledge-based programs. Since all the tests in a standard
program for agent $i$ must be local to $i$, every test $\varphi$ in a standard program for agent $i$ is equivalent to
$K_i\varphi$.

This definition is somewhat subtle, and determining the protocol(s) implementing a
given knowledge-based program may be nontrivial. Indeed, as shown by Fagin, Halpern,
Moses, and Vardi [1995, 1997], in general, there may be no protocols implementing a
knowledge-based program $\mathsf{Pg}_{kb}$ in a given context, there may be only one, or there may
be more than one, since the fixed-point equation may have no solutions, one solution,
or  many  solutions.  In  particular,  it  is  not  hard  to  show  that  there  is  no  (joint)  pro¬ 
tocol  implementing  a  (joint)  program  where  S  uses  BT^  or  BTjt,  as  described  in  the 
introduction. 
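The possible outcomes of the fixed-point equation can be observed by brute force in a very small setting. The sketch below is a toy of our own and not one of the paper's programs: a single agent whose local state is one bit $x$ (initially 0), two actions `set` and `skip`, deterministic protocols, and a finite run prefix of length `HORIZON` that stands in for an infinite run. The knowledge test is "the agent knows that eventually $x = 1$", and, as in the text, a knowledge test holds vacuously at a local state that never arises.

```python
from itertools import product

ACTIONS = ("set", "skip")
HORIZON = 4  # finite prefix used to approximate infinite runs (an assumption)


def run_of(protocol):
    """The unique run from x = 0: local states at times 0..HORIZON."""
    x, states = 0, [0]
    for _ in range(HORIZON):
        if protocol[x] == "set":
            x = 1
        states.append(x)
    return states


def knows_eventually_one(protocol, local_state):
    """K(eventually x = 1) at a local state, in the system generated by `protocol`."""
    run = run_of(protocol)
    return all(1 in run[m:] for m, x in enumerate(run) if x == local_state)


def derived(protocol, then_action, else_action):
    """Protocol obtained by evaluating the test in the system of `protocol`."""
    return {
        x: then_action if knows_eventually_one(protocol, x) else else_action
        for x in (0, 1)
    }


def implementations(then_action, else_action):
    """All protocols P with P = derived(P): solutions of the fixed-point equation."""
    candidates = [dict(zip((0, 1), acts)) for acts in product(ACTIONS, repeat=2)]
    return [p for p in candidates if derived(p, then_action, else_action) == p]
```

In this toy, the program "if K(eventually $x = 1$) then set else skip" has two implementations, while swapping the two branches yields a program with none.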

For the purposes of this paper, it is useful to have a notion slightly weaker than that
of implementation. Two joint protocols $P = (P_1, \ldots, P_n)$ and $P' = (P'_1, \ldots, P'_n)$ are
equivalent in context $\gamma$, denoted $P \approx_\gamma P'$, if (a) $\mathbf{R}(P, \gamma) = \mathbf{R}(P', \gamma)$ and (b) $P_i(\ell) = P'_i(\ell)$
for every local state $\ell = r_i(m)$ with $r \in \mathbf{R}(P, \gamma)$. Thus, two protocols that are
equivalent in $\gamma$ may disagree on the actions performed in some local states, provided
that those local states never arise in the actual runs of these protocols in $\gamma$. We say
$P$ de facto implements a knowledge-based program $\mathsf{Pg}_{kb}$ in interpreted context $(\gamma, \pi)$ if
$P \approx_\gamma \mathsf{Pg}_{kb}^{\mathbf{I}(P, \gamma, \pi)}$. Arguably, de facto implementation suffices for most purposes, since all
we care about are the runs generated by the protocol. We do not care about the behavior
of the protocol on local states that never arise.
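Conditions (a) and (b) are easy to check mechanically when each protocol generates a single finite run. The sketch below assumes, purely for illustration, one agent whose local state is a bit $x$ starting at 0, deterministic protocols given as dictionaries, and a finite horizon; the names `run_of` and `equivalent` are hypothetical.

```python
def run_of(protocol, horizon=4):
    """Local states at times 0..horizon when starting from x = 0."""
    x, states = 0, [0]
    for _ in range(horizon):
        if protocol[x] == "set":
            x = 1
        states.append(x)
    return states


def equivalent(p, q):
    """(a) same runs, and (b) same action at every local state arising in them."""
    run_p, run_q = run_of(p), run_of(q)
    return run_p == run_q and all(p[x] == q[x] for x in set(run_p))


never_set = {0: "skip", 1: "skip"}
set_only_if_one = {0: "skip", 1: "set"}
always_set = {0: "set", 1: "set"}
set_once = {0: "set", 1: "skip"}
```

Here `never_set` and `set_only_if_one` are distinct protocols that are nevertheless equivalent, because the local state 1 never arises in their runs. By contrast, `always_set` and `set_once` generate the same sequence of local states but disagree at the reachable local state 1, so condition (b) fails.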

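It is almost immediate from the definition that if $P$ implements $\mathsf{Pg}_{kb}$, then $P$ de facto
implements $\mathsf{Pg}_{kb}$. The converse may not be true, since we may have $P \approx_\gamma \mathsf{Pg}_{kb}^{\mathbf{I}(P, \gamma, \pi)}$
without having $P = \mathsf{Pg}_{kb}^{\mathbf{I}(P, \gamma, \pi)}$. On the other hand, as the following lemma shows, if $P$
de facto implements $\mathsf{Pg}_{kb}$, then a protocol closely related to $P$ implements $\mathsf{Pg}_{kb}$.

Lemma 2.2 If $P$ de facto implements $\mathsf{Pg}_{kb}$ in $(\gamma, \pi)$ then $\mathsf{Pg}_{kb}^{\mathbf{I}(P, \gamma, \pi)}$ implements $\mathsf{Pg}_{kb}$ in
$(\gamma, \pi)$.


Proof Suppose that $P$ de facto implements $\mathsf{Pg}_{kb}$ in $(\gamma, \pi)$. Let $P' = \mathsf{Pg}_{kb}^{\mathbf{I}(P, \gamma, \pi)}$. By
definition, $P' \approx_\gamma P$. Thus, $\mathbf{I}(P', \gamma, \pi) = \mathbf{I}(P, \gamma, \pi)$, so $P' = \mathsf{Pg}_{kb}^{\mathbf{I}(P', \gamma, \pi)}$. It follows that $P'$
implements $\mathsf{Pg}_{kb}$. ∎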


⁵ Note that if there is no point $(r, m)$ in $\mathcal{I}$ such that $r_i(m) = \ell$, then $K_i\varphi$ vacuously holds at $\ell$, for all
formulas $\varphi$.

3  Counterfactuals  and  Belief


In  this  section,  we  show  how  counterfactuals  and  belief  can  be  added  to  the  knowledge- 
based  framework,  and  use  them  to  do  a  formal  analysis  of  the  programs  BT^  and  BT^b 
from  the  introduction. 

3.1  Counterfactuals 

The semantics we use for counterfactuals is based on the standard semantics used in
the philosophy literature [Lewis 1973; Stalnaker 1968]. As with other modal logics, this
semantics starts with a set $W$ of possible worlds. For every possible world $w \in W$
there is a (partial) order $<_w$ defined on $W$. Intuitively, $w_1 <_w w_2$ if $w_1$ is “closer” or
“more similar” to world $w$ than $w_2$ is. This partial order is assumed to satisfy certain
constraints, such as the condition that $w <_w w'$ for all $w' \neq w$: world $w$ is closer to $w$
than any other world is. A counterfactual statement of the form $\varphi > \psi$ is then taken to
be true at a world $w$ if, in all the worlds closest to $w$ among the worlds where $\varphi$ is true,
$\psi$ is also true.

In our setting, we obtain a notion of closeness by associating with every point $(r, m)$
of a system $\mathcal{I}$ a partial order on the points of $\mathcal{I}$.⁶ An order assignment for a system
$\mathcal{I} = (\mathcal{R}, \pi)$ is a function $\ll$ that associates with every point $(r, m)$ of $\mathcal{I}$ a partial
order relation $\ll_{(r,m)}$ over the points of $\mathcal{I}$. The partial orders must satisfy the constraint
that $(r, m)$ is a minimal element of $\ll_{(r,m)}$, so that there is no run $r' \in \mathcal{R}$ and time $m' \ge 0$
satisfying $(r', m') \ll_{(r,m)} (r, m)$. A counterfactual system is a pair of the form $\mathcal{J} = (\mathcal{I}, \ll)$,
where $\mathcal{I}$ is an interpreted system as before, while $\ll$ is an order assignment for the points
in $\mathcal{I}$. Given a counterfactual system $\mathcal{J} = (\mathcal{I}, \ll)$, a point $(r, m)$ in $\mathcal{I}$, and a set $A$ of
points of $\mathcal{I}$, define

$$\mathit{closest}(A, (r, m), \mathcal{J}) = \{(r', m') \in A : \text{there is no } (r'', m'') \in A \text{ such that } (r'', m'') \ll_{(r,m)} (r', m')\}.$$

Thus, $\mathit{closest}(A, (r, m), \mathcal{J})$ consists of the closest points to $(r, m)$ among the points in $A$
(according to the order assignment $\ll$).

To allow for counterfactual statements, we extend our logical language $\mathcal{L}$ with a binary
operator $>$ on formulas, so that whenever $\varphi$ and $\psi$ are formulas, so is $\varphi > \psi$. We read
$\varphi > \psi$ as “if $\varphi$ were the case, then $\psi$,” and denote the resulting language by $\mathcal{L}^{>}$.

Let $[\![\varphi]\!] = \{(r, m) : (\mathcal{J}, r, m) \models \varphi\}$; that is, $[\![\varphi]\!]$ consists of all points in $\mathcal{J}$ satisfying
$\varphi$. We can now define the semantics of counterfactuals as follows:

$$(\mathcal{J}, r, m) \models \varphi > \psi \text{ if } (\mathcal{J}, r', m') \models \psi \text{ for all } (r', m') \in \mathit{closest}([\![\varphi]\!], (r, m), \mathcal{J}).$$

⁶ In a more general treatment, we could associate a different partial order with every agent at every
point; this is not necessary for the examples we consider in this paper.

This definition captures the intuition for counterfactuals stated earlier: $\varphi > \psi$ is true at
a point $(r, m)$ if $\psi$ is true at the points closest to $(r, m)$ where $\varphi$ is true.
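The definitions of $\mathit{closest}$ and of the truth of a counterfactual translate directly into code when the set of points is finite. In the sketch below, the three named points, the numeric distances that induce the order at the point `"actual"`, and the helper `closer_at` are all hypothetical choices of ours; a numeric distance yields a particularly simple special case of a partial order.

```python
def closest(candidates, closer):
    """Points of `candidates` with no strictly closer point in `candidates`."""
    return [p for p in candidates if not any(closer(q, p) for q in candidates)]


def counterfactual(phi, psi, point, points, closer_at):
    """Truth of phi > psi at `point`: psi holds at every closest phi-point."""
    phi_points = [p for p in points if phi(p)]
    return all(psi(p) for p in closest(phi_points, closer_at(point)))


# Hypothetical toy system: three points and their distance from "actual".
facts = {
    "actual": {"struck": False, "lit": False},
    "dry": {"struck": True, "lit": True},
    "wet": {"struck": True, "lit": False},
}
distance_from_actual = {"actual": 0, "dry": 1, "wet": 2}


def closer_at(point):
    """Strict order used at `point`; only the order at "actual" is modeled."""
    assert point == "actual"
    return lambda p, q: distance_from_actual[p] < distance_from_actual[q]


def struck(p):
    return facts[p]["struck"]


def lit(p):
    return facts[p]["lit"]
```

At `"actual"` the match is not struck, yet "struck > lit" holds there, because the closest point where the match is struck is `"dry"`. A counterfactual whose antecedent holds nowhere is vacuously true.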

All earlier analyses of (epistemic) properties of a protocol $P$ in a context $\gamma$ used the
interpreted system $\mathbf{I}(P, \gamma, \pi)$, consisting of all the runs consistent with $P$ in context $\gamma$.
However, counterfactual reasoning involves events that occur on runs that are not consistent
with $P$. To support such reasoning we need to consider runs not in $\mathbf{I}(P, \gamma, \pi)$.
The runs that must be added can, in general, depend on the type of counterfactual statements
allowed in the logical language. Thus, for example, if we allow formulas of the
form $\mathit{do}(i, a) > \psi$ for process $i$ and action $a$, then we must allow, at every point of the
system, a possible future in which $i$'s next action is $a$.⁷

An even richer set of runs is needed if we allow the language to specify a sequence of
actions performed by a given process, or if counterfactual conditionals $>$ can be nested.
To handle a broad class of applications, including ones involving formulas with temporal
operators and arbitrary nesting of conditional statements involving $\mathit{do}(i, a)$, we do
reasoning with respect to the system $\mathcal{I}^+(\gamma, \pi) = (\mathcal{R}^+(\gamma), \pi)$ consisting of all runs compatible
with $\gamma$, that is, all runs consistent with some protocol $P'$ in context $\gamma$. In this way all
possible behaviors, within the constraints induced by $\gamma$, can be reasoned about. There is
a potential problem with using system $\mathcal{I}^+(\gamma, \pi) = (\mathcal{R}^+(\gamma), \pi)$ for reasoning about $P$: all
reference to $P$ has been lost. We return to this issue in the next section, when we discuss
belief. For now we show how to use $\mathcal{I}^+(\gamma, \pi)$ as a basis for doing counterfactual reasoning.

As we have already discussed, the main issue in using $\mathcal{I}^+(\gamma, \pi)$ to reason about $P$ is
that of defining an appropriate order assignment. We are interested in order assignments
that depend on the protocol in a uniform way. An order generator $o$ for a context $\gamma$
is a function that associates with every protocol $P$ an order assignment $\ll^P = o(P)$ on
the points of $\mathcal{R}^+(\gamma)$. A counterfactual context is a tuple $\zeta = (\gamma, \pi, o)$, where $o$ is an
order generator for $\gamma$. In what follows we denote by $\mathcal{J}^c(P, \zeta)$ the counterfactual system
$(\mathcal{I}^+(\gamma, \pi), o(P))$, where $\zeta = (\gamma, \pi, o)$; we omit $\zeta$ when it is clear from context.

We are interested in order generators $o$ such that $o(P)$ says something about deviations
from $P$. For the technical results we prove in the rest of the paper, we focus
on order generators that prefer runs in which the agents do not deviate from their protocol.
Given an agent $i$, action $a$, protocol $P$, context $\gamma$, and point $(r, m)$ in $\mathcal{R}^+(\gamma)$,
define $\mathit{close}(i, a, P, \gamma, (r, m)) = \{(r', m) :$ (a) $r' \in \mathcal{R}^+(\gamma)$, (b) $r'(m') = r(m')$ for all
$m' \le m$, (c) if agent $i$ performs $a$ in round $m + 1$ of $r$, then $r' = r$, (d) if agent $i$ does
not perform $a$ in round $m + 1$ of $r$, then agent $i$ performs $a$ in round $m + 1$
of $r'$ and follows $P$ in all later rounds, (e) all agents other than $i$ follow $P$ from round
$m + 1$ on in $r'\}$. That is, $\mathit{close}(i, a, P, \gamma, (r, m))$ is the set of points $(r', m)$ where
run $r' = r$ if $i$ performs $a$ in round $m + 1$ of $r$; otherwise, $r'$ is identical to $r$ up
to time $m$ and all the agents act according to joint protocol $P$ at all later times,
except that at the point $(r', m)$, agent $i$ performs action $a$. An order generator $o$
for $\gamma$ respects protocols if, for every protocol $P$, point $(r, m)$ of $\mathbf{R}(P, \gamma)$, action $a$, and
agent $i$, $\mathit{closest}([\![\mathit{do}(i, a)]\!], (r, m), \mathcal{J}^c(P))$ is a nonempty subset of $\mathit{close}(i, a, P, \gamma, (r, m))$
that includes $(r, m)$. Of course, the most obvious order generator that respects protocols
just sets $\mathit{closest}([\![\mathit{do}(i, a)]\!], (r, m), \mathcal{J}^c(P)) = \mathit{close}(i, a, P, \gamma, (r, m))$. Since our
results hold for arbitrary order generators that respect protocols, we have allowed the
extra flexibility of allowing $\mathit{closest}([\![\mathit{do}(i, a)]\!], (r, m), \mathcal{J}^c(P))$ to be a strict subset of
$\mathit{close}(i, a, P, \gamma, (r, m))$.

⁷ Recall from the introduction that our programs use the formula $\mathit{do}(i, a)$ to state that agent $i$ is
about to perform action $a$. Thus, $\mathit{do}(i, a) > \varphi$ says “if agent $i$ were to perform $a$ then $\varphi$ would be the
case.” We assume that all interpretations we consider give this formula the appropriate meaning. If the
protocol $P$ being used is encoded in the global state (for example, if it is part of the environment state),
then we can take $\mathit{do}(i, a)$ to be a primitive proposition. Otherwise, we cannot, since its truth cannot
be determined from the global state. However, we can always take $\mathit{do}(i, a)$ to be an abbreviation for
$\bigcirc\,\mathit{last}(i, a)$, where the interpretation $\pi$ ensures that $\mathit{last}(i, a)$ is true at a point $(r, m)$ if $i$ performed $a$ in
round $m$ of $r$. Since we assume that the last joint action performed is included in the environment state,
the truth of $\mathit{last}(i, a)$ is determined by the global state.

A  number  of  points  are  worth  noting  about  this  definition: 

• If the environment’s protocol $P_e$ and the agents’ individual protocols in $P$ are all
deterministic, then $\mathit{close}(i, a, P, \gamma, (r, m))$ is a singleton, since there is a unique run
where the agents act according to joint protocol $P$ at all times except that agent $i$
performs action $a$ at time $m$. Thus, $\mathit{closest}([\![\mathit{do}(i, a)]\!], (r, m), \mathcal{J}^c(P))$ must be the
singleton $\mathit{close}(i, a, P, \gamma, (r, m))$ in this case. However, in many cases, it is best to
view the environment as following a nondeterministic protocol (for example,
nondeterministically deciding at which round a message will be delivered); in this case,
there may be several points in $\mathcal{I}$ closest to $(r, m)$. Stalnaker [1968] required there
to be a unique closest world; Lewis [1973] did not. There was later discussion of
how reasonable this requirement was (see, for example, [Stalnaker 1980]). Thinking
in terms of systems may help inform this debate.

• If process $i$ does not perform action $a$ at the point $(r, m)$, then there may be points
in $\mathit{closest}([\![\mathit{do}(i, a)]\!], (r, m), \mathcal{J}^c(P))$ that are not in $\mathbf{R}(P, \gamma)$, even if $r \in \mathbf{R}(P, \gamma)$.
These points are “counter to fact”.

• According to our definition, the notion of “closest” depends on the protocol that
generates the system. For example, consider a context $\gamma'$ that is just like the
context $\gamma$ from Example 2.1, except that $S$ keeps track in its local state, not only
of the time, but also of the number of messages it has sent. Suppose that the
protocol $P_S$ for $S$ is determined by the program

if time=0 then sendbit else skip,

while $P'_S$ is the protocol determined by the program

if #messages=0 then sendbit else skip.

Let $P = (P_S, \mathit{SKIP}_R)$ and $P' = (P'_S, \mathit{SKIP}_R)$, where $\mathit{SKIP}_R$ is the protocol where $R$
does nothing (performs the action skip) in all states. Clearly $\mathbf{R}(P, \gamma') = \mathbf{R}(P', \gamma')$:
whether it is following $P_S$ or $P'_S$, the sender $S$ sends a message only in the first
round of each run. It follows that these two protocols specify exactly the same
behavior in this context. While these protocols coincide when no deviations take
place, they may differ if deviations are possible. For example, imagine a situation
where, for whatever reason, $S$ did nothing in the first round. In that case,
at the end of the first round, the clock has advanced from 0, while the count of
the number of messages that $S$ has sent is still 0. $P$ and $P'$ would then produce
different behavior in the second round. This difference is captured by our
definitions. If $o$ respects protocols, then $\mathit{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r, 0), \mathcal{J}^c(P)) \neq
\mathit{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r, 0), \mathcal{J}^c(P'))$. No messages are sent by $S$ in runs appearing
in points in $\mathit{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r, 0), \mathcal{J}^c(P))$, while exactly one message is sent
by $S$ in each run appearing in points in $\mathit{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r, 0), \mathcal{J}^c(P'))$.

This dependence on the protocol is a deliberate feature of our definition; by using
order generators, the order assignment we consider is a function of the protocol
being used. While the protocols $P$ and $P'$ specify the same behavior in $\gamma'$, they
specify different behavior in “counterfactual” runs, where something happens that
somehow causes behavior inconsistent with the protocol. The subtle difference
between the two protocols is captured by our definitions.
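The difference between the two sender protocols under a deviation can be checked with a short simulation. The sketch below tracks only $S$'s local state (time and number of messages sent) and lets the caller force the first-round action, which models the counterfactual in which $S$ skips in round 1 and follows its protocol afterwards. The function names and the `forced_first` parameter are illustrative assumptions of ours.

```python
def p_s(time, sent):
    """Protocol determined by 'if time=0 then sendbit else skip'."""
    return "sendbit" if time == 0 else "skip"


def p_prime_s(time, sent):
    """Protocol determined by 'if #messages=0 then sendbit else skip'."""
    return "sendbit" if sent == 0 else "skip"


def actions(protocol, rounds, forced_first=None):
    """Actions S performs in each round; the first one may be forced."""
    time, sent, performed = 0, 0, []
    for k in range(rounds):
        if k == 0 and forced_first is not None:
            act = forced_first  # deviation from the protocol in round 1
        else:
            act = protocol(time, sent)
        performed.append(act)
        if act == "sendbit":
            sent += 1
        time += 1
    return performed
```

Without a deviation both protocols produce the same actions. After a forced skip in the first round, `p_s` never sends, while `p_prime_s` sends exactly once, in the second round.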

3.2  Belief 

As we have just seen, in order to allow for counterfactual reasoning about a protocol $P$
in a context $\gamma$, our model needs to represent “counterfactual” runs that do not appear
in $\mathbf{R}(P, \gamma)$. Using the counterfactual system $\mathcal{J}^c(P)$, which includes all runs of $\mathcal{R}^+(\gamma)$,
provides considerable flexibility and generality in counterfactual reasoning. However,
doing this has a rather drastic impact on the processes’ knowledge of the protocol being
used. Agents have considerable knowledge of the properties of protocol $P$ in the
interpreted system $\mathbf{I}(P, \gamma, \pi)$, since it contains only the runs of $\mathbf{R}(P, \gamma)$. For example, if
agent 1’s first action in $P$ is always $b$, then all agents are guaranteed to know this fact
(provided that it is expressible in the language, of course); indeed, this fact will be common
knowledge, which means that every agent knows it, every agent knows that every agent
knows it, and so on, for any depth of nesting of these knowledge
statements (cf. [Fagin, Halpern, Moses, and Vardi 1995; Halpern and Moses 1990]). If
we evaluate knowledge with respect to $\mathcal{R}^+(\gamma)$, then the agents have lost the knowledge
that they are running protocol $P$. We deal with this by adding extra information to the
models that allows us to capture the agents’ beliefs. Although the agents will not know
they are running protocol $P$, they will believe that they are.

A ranking function for a system $\mathcal{R}$ is a function $\kappa : \mathcal{R} \to \mathbb{N}^+$, associating with every
run of $\mathcal{R}$ a rank $\kappa(r)$, which is either a natural number or $\infty$, such that $\min_{r \in \mathcal{R}} \kappa(r) = 0$.⁸
Intuitively, the rank of a run defines the likelihood of the run. Runs of rank 0 are most
likely; runs of rank 1 are somewhat less likely, those of rank 2 are even more unlikely,
and so on. Very roughly speaking, if $\epsilon > 0$ is small, we can think of the runs of rank $k$
as having probability $O(\epsilon^k)$. For our purposes, the key feature of rankings is that they
can be used to define a notion of belief (cf. [Friedman and Halpern 1997]). Intuitively,
of all the points considered possible by a given agent at a point $(r, m)$, the ones believed
to have occurred are the ones appearing in runs of minimal rank. More formally, for a
point $(r, m)$ define

$$\mathit{min}_i^\kappa(r, m) = \min\{\kappa(r') \mid r' \in \mathcal{R} \text{ and } r'_i(m') = r_i(m) \text{ for some } m' \ge 0\}.$$

Thus, $\mathit{min}_i^\kappa(r, m)$ is the minimal $\kappa$-rank of runs in which $r_i(m)$ appears as a local state
for agent $i$.

⁸ The similarity in notation with the $\kappa$-rankings of [Goldszmidt and Pearl 1992], which are based on
Spohn’s ordinal conditional functions [1988], is completely intentional. Indeed, everything we are saying
here can be recast in Spohn’s framework.

An extended system is a triple of the form $\mathcal{J} = (\mathcal{I}, \ll, \kappa)$, where $(\mathcal{I}, \ll)$ is a counterfactual
system, and $\kappa$ is a ranking function for the runs of $\mathcal{I}$. In extended systems we
can define a notion of belief. The logical language that results from closing $\mathcal{L}^{>}$ (resp. $\mathcal{L}$)
under belief operators $B_i$, for $i = 1, \ldots, n$, is denoted $\mathcal{L}^{>}_B$ (resp. $\mathcal{L}_B$). The truth of $B_i\varphi$
is defined as follows:

$$(\mathcal{I}, \ll, \kappa, r, m) \models B_i\varphi \text{ iff } (\mathcal{I}, \ll, \kappa, r', m') \models \varphi \text{ for all } (r', m') \text{ such that } \kappa(r') = \mathit{min}_i^\kappa(r, m) \text{ and } r'_i(m') = r_i(m).$$

What distinguishes knowledge from belief is that knowledge satisfies the knowledge
axiom: $K_i\varphi \Rightarrow \varphi$ is valid. While $B_i\varphi \Rightarrow \varphi$ is not valid, it is true in runs of rank 0.

Lemma 3.1 Suppose that $\mathcal{J} = ((\mathcal{R}, \pi), \ll, \kappa)$ is an extended system, $r \in \mathcal{R}$, and $\kappa(r) = 0$.
Then for every formula $\varphi$ and all times $m$, we have $(\mathcal{J}, r, m) \models B_i\varphi \Rightarrow \varphi$.

Proof Assume that $\kappa(r) = 0$. Thus, $\mathit{min}_i^\kappa(r, m) = 0$ for all times $m \ge 0$. It now
immediately follows from the definitions that if $(\mathcal{J}, r, m) \models B_i\varphi$, then $(\mathcal{J}, r, m) \models \varphi$. ∎
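Belief relative to a ranking can be computed on a finite system in the same style as knowledge, with the extra step of restricting attention to runs of minimal rank. In the sketch below, runs are lists of global states indexed by time, each global state is a dictionary of local states, and the two runs `"ok"` and `"dev"` together with their ranks are hypothetical examples of ours.

```python
def min_rank(agent, point, runs, rank):
    """Minimal rank of a run in which the agent's current local state appears."""
    r, m = point
    here = runs[r][m][agent]
    return min(
        rank[q]
        for q, states in runs.items()
        if any(s[agent] == here for s in states)
    )


def believes(agent, fact, point, runs, rank):
    """True if `fact` holds at all minimal-rank points the agent considers possible."""
    r, m = point
    here = runs[r][m][agent]
    k = min_rank(agent, point, runs, rank)
    return all(
        fact(q, t)
        for q, states in runs.items()
        if rank[q] == k
        for t, s in enumerate(states)
        if s[agent] == here
    )


# Hypothetical system: in run "dev" the message is lost, which is ranked less likely.
runs = {
    "ok": [{"S": "init", "R": "none"}, {"S": "sent", "R": "got"}],
    "dev": [{"S": "init", "R": "none"}, {"S": "sent", "R": "none"}],
}
rank = {"ok": 0, "dev": 1}


def got(run_name, time):
    return runs[run_name][time]["R"] == "got"
```

In this toy, $S$ believes `got` at time 1 of both runs. The belief is correct in the rank-0 run `"ok"`, as Lemma 3.1 requires, and mistaken in the rank-1 run `"dev"`.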

By analogy with order generators, we now want a uniform way of associating with
each protocol $P$ a ranking function. Intuitively, we want to do this in a way that lets
us recover $P$. We say that a ranking function $\kappa$ is $P$-compatible (for $\gamma$) if $\kappa(r) = 0$ if
and only if $r \in \mathbf{R}(P, \gamma)$. A ranking generator for a context $\gamma$ is a function $\sigma$ ascribing
to every protocol $P$ a ranking $\sigma(P)$ on the runs of $\mathcal{R}^+(\gamma)$. A ranking generator $\sigma$ is
deviation compatible if $\sigma(P)$ is $P$-compatible for every protocol $P$. An obvious example
of  a  deviation-compatible  ranking  generator  is  the  characteristic  ranking  generator  cr g 
that, for a given protocol $P$, yields a ranking that assigns rank 0 to every run in $\mathbf{R}(P, \gamma)$
and rank 1 to all other runs. This captures the assumption that runs of $P$ are likely
and all other runs are unlikely, without attempting to distinguish among them. Another
deviation-compatible ranking generator is $\sigma^*$, where the ranking $\sigma^*(P)$ assigns to a run $r$
the total number of times that agents deviate from $P$ in $r$. Obviously, $\sigma^*(P)$ assigns $r$
the rank 0 exactly if $r \in \mathbf{R}(P, \gamma)$, as desired. Intuitively, $\sigma^*$ captures the assumption
that not only are deviations unlikely, but they are independent. It is clearly possible to
construct other $P$-compatible rankings that embody other assumptions. For example,
deviation can be taken to be an indication of faulty behavior. Runs of rank $k$ can be those
where exactly $k$ processes are faulty.
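The two ranking generators described above can be sketched as functions of a run and a protocol. We assume here, for illustration only, that a run is a list of rounds, that each round maps an agent to a pair (local state, action performed), that a protocol maps each agent to a function from local states to sets of allowed actions, and that a run is consistent with the protocol exactly when no agent ever performs a disallowed action. The function names are hypothetical.

```python
def count_deviations(run, protocol):
    """Number of (round, agent) pairs where the action is not allowed by the protocol."""
    return sum(
        1
        for step in run
        for agent, (local_state, action) in step.items()
        if action not in protocol[agent](local_state)
    )


def characteristic_rank(run, protocol):
    """Rank 0 for runs without deviations, rank 1 for all other runs."""
    return 0 if count_deviations(run, protocol) == 0 else 1


def deviation_count_rank(run, protocol):
    """Rank equal to the total number of deviations in the run."""
    return count_deviations(run, protocol)


# Sender protocol "send only at time 0"; the local state is just the time.
protocol = {"S": lambda time: {"sendbit"} if time == 0 else {"skip"}}
on_protocol = [{"S": (0, "sendbit")}, {"S": (1, "skip")}]
one_deviation = [{"S": (0, "skip")}, {"S": (1, "skip")}]
two_deviations = [{"S": (0, "skip")}, {"S": (1, "sendbit")}]
```

Both rankings assign rank 0 exactly to the run without deviations, so both are compatible with the protocol in the sense defined above; they differ only in how finely they separate the deviating runs.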

Our interest in deviation-compatible ranking generators is motivated by the observation
that the notion of belief that they give rise to in $\mathcal{I}^+(\gamma, \pi)$ generalizes the notion of
knowledge with respect to $\mathbf{I}(P, \gamma, \pi)$. To make this precise, define $\varphi^B$ to be the formula
that is obtained by replacing all $K_i$ operators in $\varphi$ by $B_i$. (Notice that if $\varphi \in \mathcal{L}_K$ then
$\varphi^B \in \mathcal{L}_B$.) In addition, since ranking generators now play a role in determining beliefs,
we define an interpreted belief context to be a triple of the form $(\gamma, \pi, \sigma)$.
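The translation from $\varphi$ to $\varphi^B$ is a purely syntactic recursion over the formula. The sketch below assumes a hypothetical encoding of formulas as nested tuples: a string is a primitive proposition, `("K", i, f)` stands for $K_i f$, and any other tuple is an operator name followed by its subformulas.

```python
def to_belief(formula):
    """Replace every knowledge operator K_i in the formula by the belief operator B_i."""
    if isinstance(formula, str):  # primitive proposition
        return formula
    op = formula[0]
    if op == "K":  # ("K", agent, subformula)
        return ("B", formula[1], to_belief(formula[2]))
    # Any other operator: translate the subformulas and keep the operator.
    return (op,) + tuple(to_belief(sub) for sub in formula[1:])


example = ("and", ("K", 1, "p"), ("not", ("K", 2, ("K", 1, "q"))))
```

Applying `to_belief` to `example` yields the same formula with both nested occurrences of `K` replaced by `B`, and formulas without knowledge operators are returned unchanged.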

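Theorem 3.2 Let $\sigma$ be a deviation-compatible ranking generator for $\gamma$. For every formula
$\varphi \in \mathcal{L}_K$ and for all points $(r, m)$ of $\mathcal{R} = \mathbf{R}(P, \gamma)$ and every ordering $\ll$ we have

$$(\mathbf{I}(P, \gamma, \pi), r, m) \models \varphi \text{ iff } (\mathcal{I}^+(\gamma, \pi), \ll, \sigma(P), r, m) \models \varphi^B.$$

Proof We proceed by induction on the structure of $\varphi$. For primitive propositions,
the result is immediate by definition, and the argument is trivial if $\varphi$ is a conjunction
or a negation. Thus, assume that $\varphi$ is of the form $K_i\psi$. Let $\kappa = \sigma(P)$. Then
$(\mathbf{I}(P, \gamma, \pi), r, m) \models K_i\psi$ iff $(\mathbf{I}(P, \gamma, \pi), r', m') \models \psi$ for all $(r', m')$ such that $r' \in \mathbf{R}(P, \gamma)$
and $r'_i(m') = r_i(m)$. But $r' \in \mathbf{R}(P, \gamma)$ iff $\kappa(r') = 0$. Thus, by the induction hypothesis,
$(\mathbf{I}(P, \gamma, \pi), r, m) \models K_i\psi$ iff $(\mathcal{I}^+(\gamma, \pi), \ll, \kappa, r', m') \models \psi^B$ for all $(r', m')$ such that
$\kappa(r') = 0$ and $r'_i(m') = r_i(m)$. Note that $\mathit{min}_i^\kappa(r, m) = 0$ (because $\kappa(r) = 0$). Thus, it
easily follows that $(\mathbf{I}(P, \gamma, \pi), r, m) \models K_i\psi$ iff $(\mathcal{I}^+(\gamma, \pi), \ll, \kappa, r, m) \models B_i\psi^B$. ∎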

In light of Theorem 3.2, from this point on we work with the larger system $\mathcal{I}^+(\gamma, \pi)$
and use belief relative to deviation-compatible ranking generators, instead of working
with the system $\mathbf{I}(P, \gamma, \pi)$ and using knowledge.

By having both ranking generators and order generators in our framework, we can
handle both belief and counterfactual reasoning. Thus, for example, we can write
$B_3(\mathit{do}(1, a) > \varphi)$ to represent agent 3’s belief that if agent 1 were to perform action $a$
in the next round, then $\varphi$ would hold. We can further write $B_3(\mathit{do}(1, a) > \varphi) > \psi$
to state that were it the case that agent 3 had the above belief, then in fact $\psi$ would
hold. Arbitrary nesting of belief and counterfactuals is allowed. To take advantage of
the expressive features of the framework, we now define the analogue of knowledge-based
programs, to allow for belief and counterfactuals.

A counterfactual belief-based program (or cbb program for short) has the same form as
a knowledge-based program, except that the underlying logical language for the formulas
appearing in tests is now $\mathcal{L}^{>}_B$ instead of $\mathcal{L}_K$, and all tests in the program text $\mathsf{Pg}_i$ for
agent $i$ are formulas of the form $B_i\psi$ or $\neg B_i\psi$. As with knowledge-based programs, we
are interested in when a protocol $P$ implements a cbb program $\mathsf{Pg}_{cb}$. Again, the idea
is that the protocol should act according to the high-level program, when the tests are
evaluated relative to the counterfactual belief-based system corresponding to $P$. To make
this precise, given an extended system $\mathcal{J} = (\mathcal{I}, \ll, \kappa)$ and a cbb program $\mathsf{Pg}_{cb}$, let $\mathsf{Pg}_{cb}^{\mathcal{J}}$
denote the protocol derived from $\mathsf{Pg}_{cb}$ by using $\mathcal{J}$ to evaluate the belief tests. That is, a
test in $\mathsf{Pg}_{cb}$ such as $B_i\varphi$ holds at a point $(r, m)$ relative to $\mathcal{J}$ if $\varphi$ holds at all points $(r', m')$
in $\mathcal{J}$ such that $r'_i(m') = r_i(m)$ and $\kappa(r') = \mathit{min}_i^\kappa(r, m)$. Define an extended context to
be a tuple $(\gamma, \pi, o, \sigma)$, where $(\gamma, \pi)$ is an interpreted context, $o$ is an ordering generator
for $\mathcal{R}^+(\gamma)$, and $\sigma$ is a deviation-compatible ranking generator for $\gamma$. An extended system
$(\mathcal{I}, \ll, \kappa)$ represents the belief-based program $\mathsf{Pg}_{cb}$ in extended context $(\gamma, \pi, o, \sigma)$ if (a)
$\mathcal{I} = \mathcal{I}^+(\gamma, \pi)$, (b) $\ll = o(\mathsf{Pg}_{cb}^{(\mathcal{I}, \ll, \kappa)})$, and (c) $\kappa = \sigma(\mathsf{Pg}_{cb}^{(\mathcal{I}, \ll, \kappa)})$. A protocol $P$ implements
$\mathsf{Pg}_{cb}$ in $(\gamma, \pi, o, \sigma)$ if $P = \mathsf{Pg}_{cb}^{(\mathcal{I}^+(\gamma, \pi), o(P), \sigma(P))}$. Protocol $P$ de facto implements $\mathsf{Pg}_{cb}$ in
$(\gamma, \pi, o, \sigma)$ if $P \approx_\gamma \mathsf{Pg}_{cb}^{(\mathcal{I}^+(\gamma, \pi), o(P), \sigma(P))}$.

There is a close connection between the notions of implementation for knowledge-based
programs and implementation for cbb programs using deviation-compatible rankings.
Given a knowledge-based program Pg_kb, we denote by Pg_kb^B the program that results
from replacing every knowledge operator K_i appearing in Pg_kb by B_i, for all agents
i = 1, ..., n. (This is, in particular, a cbb program with no counterfactual operators.)


Theorem 3.3 Let Pg_kb be a knowledge-based program and let σ be a deviation-compatible
ranking generator for γ. Moreover, let o be an arbitrary ordering generator for R⁺(γ). A
protocol P de facto implements Pg_kb in (γ, π) if and only if P de facto implements Pg_kb^B
in (γ, π, o, σ).


Proof Since σ is deviation compatible, by Theorem 3.2, for all points (r, m) of R(P, γ),
we have that (I(P, γ, π), r, m) ⊨ φ iff (I⁺(γ, π), o(P), σ(P), r, m) ⊨ φ^B. Let Pg_cb = Pg_kb^B
and let J(P) = (I⁺(γ, π), o(P), σ(P)). Then

    (Pg_kb)_i^{I(P,γ,π)}(r_i(m)) = (Pg_cb)_i^{J(P)}(r_i(m)) whenever r ∈ R(P, γ).    (1)

Now suppose that P de facto implements Pg_kb. By definition, P ≈_γ Pg_kb^{I(P,γ,π)}. Thus, the
only global states that arise when running Pg_kb^{I(P,γ,π)} are those of the form r(m) for some
r ∈ R(P, γ). It easily follows from (1) that I(Pg_kb^{I(P,γ,π)}, γ, π) = I(Pg_cb^{J(P)}, γ, π). Thus, P
de facto implements Pg_cb as well. The argument in the other direction is analogous. ∎

Theorem 3.3 shows that a protocol P de facto implements a knowledge-based program
iff P de facto implements the corresponding belief-based program. Thus, by using
deviation-compatible rankings, cbb programs can essentially emulate knowledge-based
programs. The move to cbb programs as defined here thus provides what may be
considered a conservative extension of the knowledge-based framework: it allows us to treat
beliefs and counterfactuals, while being able to handle everything that the old theory
gave us without changing the results.
3.3 Analysis of the Bit-Transmission Problem

Recall the program BT_S from the introduction: if K_S(recbit) then skip else sendbit.
With this program, S keeps sending the bit until it knows that R has received the
bit. As discussed in the introduction, it would be even more efficient for S to stop
sending the bit once it knows that eventually R will receive it. As we saw, replacing
K_S(recbit) by K_S(◇recbit) leads to problems. We can deal with these problems by using
counterfactuals (and, thus, belief rather than knowledge), as in the cbb program BT^>_S
from the introduction:

if B_S(do(S, skip) > ◇recbit) then skip else sendbit.

This program says that S should send the bit unless it believes that, even if it were
not to send the bit in the current round, R would eventually receive the bit. Similarly, the
program BT^{◇B}_S says that S should send the bit unless it believes that R would eventually
correctly believe its value:

if B_S(do(S, skip) > ◇B_R(bit)) then skip else sendbit.

(Recall that B_R(bit) is short for (bit = 0 ∧ B_R(bit = 0)) ∨ (bit = 1 ∧ B_R(bit = 1)).)

Let BT^> = (BT^>_S, SKIP_R) and, similarly, let BT^{◇B} = (BT^{◇B}_S, SKIP_R). We now consider
the implementations of BT^> and BT^{◇B} in three different contexts:

• γ_1, in which messages are guaranteed to be delivered within five rounds;⁹

• γ_2, in which messages are guaranteed to arrive eventually, but there is no upper
bound on message delivery time; and

• γ_3, in which a message that is sent infinitely often is guaranteed to arrive, but there
is no upper bound on message delivery time. (Nothing can be said about a message
sent only finitely often; this is a standard type of fairness assumed in the literature
[Francez 1986].)

In  all  contexts  that  we  consider,  messages  cannot  be  reordered  or  duplicated.  Moreover, 
a  message  can  be  delivered  only  if  it  was  previously  sent.  We  assume  for  now  that  we  are 
working  in  synchronous  systems,  so  that  processes  can  keep  track  of  the  round  number. 
(Indeed,  we  cannot  really  make  sense  out  of  messages  being  delivered  in  five  rounds  in 
asynchronous  systems.)  At  the  end  of  this  section  we  briefly  comment  on  how  our  results 
can  be  modified  to  apply  to  asynchronous  systems.  We  now  describe  these  contexts  more 
formally. 

In γ_1 = (P_e^1, G_0, τ^1, Ψ^1), an agent can perform one of two actions: skip and sendbit,
with the obvious outcome. The local state of S consists of three components: (a) a
Boolean variable bit that is fixed throughout the run, (b) a clock value, encoded in the
variable time, which is always equal to the round number (at a point (r, m) the clock
value is m), and (c) the message history, which is the sequence of messages that S has sent
and received, each marked by the time at which it was sent or received. The local state of the
receiver R consists of the clock value and R's message history. Assume that the set G_0 of
initial states in γ_1 consists of two states: one in which bit = 0 and one in which bit = 1.
In both states the clock values are 0 and message histories are empty. In this context,
messages are guaranteed to be delivered within at most five rounds. The environment
can perform the action of delivering a message. Its protocol P_e^1 consists of deciding when
messages are delivered, subject to this constraint. Since the environment's state keeps
track of all actions performed, it can be determined from the state which messages are
in transit and how long they have been in transit. Ψ^1 imposes no restrictions: all runs are
considered admissible.

⁹ There is nothing special about five rounds here; any other fixed number would do for the
purposes of this example.

The context γ_2 = (P_e^2, G_0, τ^2, Ψ^2) is a variant of γ_1 with asynchronous communication.
The set of initial states is the same as in γ_1, as are the local states of S and R. Every
message sent is guaranteed to be delivered, but there is no bound on the time it will spend
in transit. Thus, the environment's state again keeps track of the messages in transit, while
the environment's protocol P_e^2 decides at each point (nondeterministically) which, if any,
of the messages in transit should be delivered in the current round. The constraint
that messages are guaranteed to eventually be delivered is captured by the admissibility
constraint Ψ^2; the set Ψ^2 consists of the runs in which every message sent is eventually
delivered.

The only difference between γ_3 = (P_e^3, G_0, τ^3, Ψ^3) and γ_2 is that the admissibility
condition Ψ^3 is more liberal than (i.e., is a superset of) Ψ^2. The set Ψ^3 consists of all
runs r that are fair in the sense that, for every time m, if a given message μ is sent
infinitely often in r after time m, then at least one of the copies of μ sent after time m
is delivered.
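
The difference between the three delivery guarantees can be made concrete with a small
sketch. It is only an illustration under stated assumptions: the representation of a
message as a pair (sent_at, delivered_at) and the names MAX_DELAY and violates_gamma1 are
hypothetical and do not come from the paper. The bound of γ_1 is a condition that a finite
prefix of a run can already violate, whereas the guarantees of γ_2 and γ_3 constrain only
infinite runs, so no finite prefix can violate them.

```python
from typing import List, Optional, Tuple

MAX_DELAY = 5  # delivery bound of context gamma_1; any fixed bound would do

# Hypothetical encoding: a message is (sent_at, delivered_at), where
# delivered_at is None while the message is still in transit.
Message = Tuple[int, Optional[int]]


def violates_gamma1(messages: List[Message], horizon: int) -> bool:
    """Return True if a prefix observed up to time `horizon` already breaks
    the gamma_1 constraint (delivery within MAX_DELAY rounds of sending)."""
    for sent_at, delivered_at in messages:
        if delivered_at is None:
            # Still in transit: a violation once the deadline has passed.
            if horizon >= sent_at + MAX_DELAY:
                return True
        elif not (sent_at < delivered_at <= sent_at + MAX_DELAY):
            # Delivered before being sent, or after the deadline.
            return True
    return False


# There is deliberately no analogous check for gamma_2 or gamma_3: "eventually
# delivered" and "delivered if sent infinitely often" are conditions on whole
# infinite runs, and every finite prefix can be extended to satisfy them.
```

For example, a message sent at time 2 and still undelivered at time 6 is acceptable in γ_1,
but the same message still undelivered at time 7 is not.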

We define three sets of extended contexts, extending γ_i, i = 1, 2, 3. Let EC_i consist of
all contexts of the form (γ_i, π, o, σ), i = 1, 2, 3, where π interprets the propositions bit = 0
and bit = 1 in the natural way, o respects protocols, and σ is deviation compatible.

We claim that both BT^> and BT^{◇B} solve the bit-transmission problem in every
extended context in EC_i, i = 1, 2, 3. But what does it mean for a protocol to "solve" the
bit-transmission problem? To make this precise, we need to specify the problem. In the
case of the bit-transmission problem, the specification is simple: we want the receiver to
eventually know the bit. Thus, we say that a cbb program Pg solves the bit-transmission
problem in extended context ζ = (γ, π, o, σ) if, for every protocol P that de facto
implements Pg, we have that (I⁺(P, ζ), r, 0) ⊨ ◇B_R(bit) for every run r ∈ R(P, γ). Notice
that using belief here is safe, because we are requiring only that the belief hold in runs
of P. Lemma 3.1 guarantees that, in these runs (which all have rank 0), the beliefs are
true.

Theorem 3.4 Both BT^> and BT^{◇B} solve the bit-transmission problem in all the
extended contexts in EC_1 ∪ EC_2 ∪ EC_3.

Proof Let ζ = (γ, π, o, σ) be a context in EC_1 ∪ EC_2 ∪ EC_3 and assume that P de
facto implements BT^> or BT^{◇B} in ζ. Let J = (I⁺(γ, π), o(P), σ(P)) and let r ∈ R(P, γ)
be a run of P in γ. We first consider the case that P implements BT^>; the argument in
the case that P implements BT^{◇B} is even easier, and is sketched afterwards. There are
two cases:

(a) Suppose that (J, r, m) ⊨ B_S(do(S, skip) > ◇recbit) for some m ≥ 0. Since P de
facto implements BT^>, S performs skip in round m + 1 of r. Thus, we have that
(J, r, m) ⊨ do(S, skip). Since σ(P) is deviation compatible and r ∈ R(P, γ),
it follows that (J, r, m) ⊨ do(S, skip) > ◇recbit. Since o respects protocols,
(r, m) ∈ closest([do(S, skip)], (r, m), J). It now follows from the semantics of
> that (J, r, m) ⊨ ◇recbit. Since P de facto implements BT^>, if S sends a value
in a run r' of P, S is actually sending the bit. Since σ(P) is deviation compatible, it
follows that in every run r' of P, we have that (J, r', m') ⊨ recbit ⇒ B_R(bit), since
all the points in min_R^{σ(P)}(r', m') are points on runs of P. Thus, (J, r, m) ⊨ ◇B_R(bit),
and hence (J, r, 0) ⊨ ◇B_R(bit).

(b) Suppose that (J, r, m) ⊭ B_S(do(S, skip) > ◇recbit) for all m ≥ 0. Since P de facto
implements BT^>, it follows that S sends the bit in every round of r. (In particular,
the bit is sent by S infinitely often.) All three contexts under consideration have the
property that a message sent infinitely often is guaranteed to be delivered. Thus, at
some time m' > 0 in r, the receiver will receive the bit; that is, (J, r, m') ⊨ recbit
for some m' > 0. Then, just as in part (a), it follows that (J, r, m') ⊨
B_R(bit), and hence that (J, r, 0) ⊨ ◇B_R(bit).

The argument is almost identical (and somewhat simpler) if P implements BT^{◇B}.
Now we split into two cases according to whether there is some m such that (J, r, m) ⊨
B_S(do(S, skip) > ◇B_R(bit)). Using the same arguments as above (but skipping the
argument that J ⊨ recbit ⇒ B_R(bit)) we get that, in both cases, (J, r, 0) ⊨ ◇B_R(bit). ∎


Theorem 3.4, while useful, does not give us all we want. In particular, it shows
neither that BT^> and BT^{◇B} are implementable nor that S sends relatively few messages
according to any protocol that implements BT^> or BT^{◇B} (which, after all, was the goal
of using counterfactuals in this setting). In fact, as we now show, both BT^> and BT^{◇B}
are implementable in all three sets of contexts, and their implementations are as
message-efficient as possible. We consider each of EC_1, EC_2, and EC_3 in turn.

Intuitively, in order to solve the bit-transmission problem in a context in which
messages are always delivered, sending the bit only once in any given run should suffice.
Consider the collection of protocols P^1(k, m) = (P_S^1(k, m), SKIP_R) for k, m ∈ N, where
P_S^1(k, m) is described by the program

if (time = k and bit = 0) or (time = m and bit = 1) then sendbit else skip.

In these protocols, the sender S sends its bit at time k if the bit value is 0, and at time m
if it is 1. We now show that all protocols of the form P^1(k, m) implement BT^> in all
contexts in EC_1:

Lemma 3.5 The protocol P^1(k, m) de facto implements BT^> in every extended context
in EC_1.

Proof Fix k, m, and a context ζ = (γ_1, π, o, σ) ∈ EC_1. We want to show that
P^1(k, m) ≈_{γ_1} (BT^>)^{J(k,m)}, where J(k, m) = (I⁺(γ_1, π), o(P^1(k, m)), σ(P^1(k, m))). We
can characterize a run consistent with P^1(k, m) by the value of bit and when the one
message sent by S is received. Let r_{b,n} be the run where bit = b and the message is
received at time n (clearly k + 5 ≥ n > k if b = 0 and m + 5 ≥ n > m if b = 1). Clearly
the formula recbit holds in run r_{b,n} from time n on. Thus, ◇recbit holds at every point
in every run consistent with P^1(k, m) in the system J(k, m). Note that the runs r_{b,n} are
precisely those of rank 0 in J(k, m).

We now show that a run r is consistent with (BT^>)^{J(k,m)} in γ_1 iff r = r_{b,n} for b ∈ {0, 1}
and a value of n satisfying k + 5 ≥ n > k if b = 0 and m + 5 ≥ n > m if b = 1. So
suppose that r is consistent with (BT^>)^{J(k,m)} and the value of the bit in r is 0. It
suffices to show that S sends exactly one message in r, and that happens at time k.
If n' ≠ k, then clearly (J(k, m), r, n') ⊨ do(S, skip) > ◇recbit, since the closest point
to (r, n') where do(S, skip) holds is (r, n') itself. On the other hand, if n' = k, then
closest([do(S, skip)], (r, n'), J(k, m)) = {(r'_0, n')}, where r'_0 is the run where S never
sends any messages and the initial bit is 0. In this case, the properties of γ_1 guarantee
that no message is ever received by R in r'_0, and ◇recbit does not hold at (r'_0, k). It follows
that the test B_S(do(S, skip) > ◇recbit) fails at (r, k), and r is consistent with BT^> if and
only if the action sendbit is performed in round k + 1 of r. Hence, r is one of the runs
r_{0,n} with k + 5 ≥ n > k. A completely analogous treatment applies if bit = 1 in r. We
thus have that exactly the runs r_{b,n} described are consistent with (BT^>)^{J(k,m)} in γ_1, and
hence P^1(k, m) de facto implements BT^> in every extended context in EC_1, as desired. ∎
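
The sender's rule in P^1(k, m) and the family of runs r_{b,n} used in the proof of Lemma 3.5
can be written out mechanically. The following is a minimal sketch, assuming the five-round
bound of γ_1; the function names p1_sender and p1_runs, and the encoding of a run r_{b,n} as
the pair (b, n), are hypothetical conveniences rather than notation from the paper.

```python
from typing import List, Tuple


def p1_sender(k: int, m: int, time: int, bit: int) -> str:
    """Sender's action under P^1(k, m): send at time k if bit = 0,
    and at time m if bit = 1; otherwise do nothing."""
    if (time == k and bit == 0) or (time == m and bit == 1):
        return "sendbit"
    return "skip"


def p1_runs(k: int, m: int, max_delay: int = 5) -> List[Tuple[int, int]]:
    """Labels (b, n) of the runs r_{b,n}: the bit is b and the single message
    is received at time n, with send_time < n <= send_time + max_delay."""
    runs = []
    for b, send_time in ((0, k), (1, m)):
        for n in range(send_time + 1, send_time + max_delay + 1):
            runs.append((b, n))
    return runs
```

With the bound of five rounds, p1_runs(k, m) has ten elements for every choice of k and m,
five for each value of the bit, matching the count of rank-0 runs used in the arguments for
EC_1.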

In the context γ_1, there is a fixed bound on message delivery time. As a result,
we might hope to save on message delivery in some cases. Suppose that we use a
one-sided protocol that sends the bit only if bit = 0. Then the receiver should be able
to conclude that the value of the bit is 1 if a message stating the bit is 0 does not
arrive within the specified time bounds. More generally, define the collection of protocols
P^2(k, b) = (P_S^2(k, b), SKIP_R) for b ∈ {0, 1} and k ∈ N, where P_S^2(k, b) is the protocol
implementing the program

if time = k and bit = b then sendbit else skip.

According to P_S^2(k, b), the sender S sends a message only in runs where the bit is b; if the
bit is 1 − b, it sends no messages. Moreover, in runs where the bit is b, S sends only one
message, at time k. This type of optimization (sending a message only for one of the two
bit values) was used in the message-optimal protocols of [Hadzilacos and Halpern 1993];
it can be used in synchronous systems in which there is an upper bound on the message
delivery time, as in contexts in EC_1.
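
The one-sided optimization can be illustrated with a short sketch. It is hedged in two
ways: the function names are hypothetical, and receiver_conclusion is an informal decision
rule that mirrors the intuition in the text (silence past the deadline signals the other
bit value). In the paper, the receiver's conclusion is derived from its beliefs under a
deviation-compatible ranking, not from an explicit rule of this kind.

```python
from typing import Optional


def p2_sender(k: int, b: int, time: int, bit: int) -> str:
    """Sender's action under P^2(k, b): send once, at time k, only if bit = b."""
    return "sendbit" if (time == k and bit == b) else "skip"


def receiver_conclusion(k: int, b: int, time: int, received: bool,
                        max_delay: int = 5) -> Optional[int]:
    """Illustrative rule for R (an assumption, not the paper's definition):
    a received message means the bit is b; silence up to time k + max_delay
    means the bit is 1 - b; before that deadline R cannot tell."""
    if received:
        return b
    if time >= k + max_delay:
        return 1 - b
    return None
```

With k = 3 and b = 0, a receiver that has heard nothing by time 5 is still undecided,
while at time 8 it concludes that the bit is 1 without any message having been sent.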

It is easy to verify that P^2(k, b) does not implement BT^>: Intuitively, in a run r of
P^2(k, b) with bit = 1 − b, the sender S never sends the bit, and hence ◇recbit does not
hold. Since S follows P^2(k, b) in r, the formula do(S, skip) holds at time 0 in r. It follows
that in evaluating the test B_S(do(S, skip) > ◇recbit) the closest point to (r, 0) is (r, 0)
itself. Because ◇recbit does not hold at that point, the test fails, and according to BT^>
the sender S should perform sendbit. Since, in fact, S does not perform sendbit at (r, 0),
and r is a run of P^2(k, b), we conclude that P^2(k, b) does not implement BT^>. However,
as we now show, P^2(k, b) does implement the more sophisticated program BT^{◇B}:

Lemma 3.6 Every instance of P^2(k, b) de facto implements BT^{◇B} in every context in
EC_1.


Proof Fix k, b, and a context ζ = (γ_1, π, o, σ) ∈ EC_1. We want to show that
P^2(k, b) ≈_{γ_1} (BT^{◇B})^{J(k,b)}, where J(k, b) = (I⁺(γ_1, π), o(P^2(k, b)), σ(P^2(k, b))). Note
that there are exactly six runs consistent with P^2(k, b) in context γ_1: five runs r^m,
m = k + 1, ..., k + 5, where the value of the bit is b, the message is sent in round k + 1
and it arrives in round m; the sixth run is r_{1−b}, where the value of the bit is 1 − b and
no message is sent. It is easy to check that in the extended system J(k, b), the formula
bit = b ∧ B_R(bit = b) holds in runs r^m from time m on, while in run r_{1−b} the formula
bit = 1 − b ∧ B_R(bit = 1 − b) holds from time k + 5 on. Thus, ◇B_R(bit) holds at every
point in the six runs in R(P^2(k, b), γ_1). Note that these six runs are exactly the runs of
rank 0.

We now show that r is consistent with (BT^{◇B})^{J(k,b)} iff r ∈ R(P^2(k, b), γ_1). We
consider two cases, according to the value of the bit in r. First suppose that bit = 1 − b
in the run r. We prove by induction on m' ≥ 0 that (a) if r is consistent with (BT^{◇B})^{J(k,b)}
then (i) r(m') = r_{1−b}(m') and (ii) (BT^{◇B}_S)^{J(k,b)}(r_S(m')) = skip, and (b) r_{1−b} is consistent
with (BT^{◇B})^{J(k,b)} up to time m'. For the base case, observe that r(0) = r_{1−b}(0) because
there is only one initial state in γ_1 with bit = 1 − b. Clearly r_{1−b} is consistent with
(BT^{◇B})^{J(k,b)} up to time 0. Thus, parts (a)(i) and (b) hold. For part (a)(ii), to see that
(BT^{◇B}_S)^{J(k,b)}(r_S(0)) = skip, it suffices to show that (J(k, b), r, 0) ⊨ B_S(do(S, skip) >
◇B_R(bit)). Since σ is deviation compatible and S knows that bit = 1 − b, it follows
that min_S^{σ(P^2(k,b))}(r, 0) = {(r_{1−b}, 0)}. Thus, it suffices to show that (J(k, b), r_{1−b}, 0) ⊨
do(S, skip) > ◇B_R(bit). But this is immediate from the fact that (J(k, b), r_{1−b}, 0) ⊨
do(S, skip) and, as observed earlier, that (J(k, b), r_{1−b}, 0) ⊨ ◇B_R(bit).

For the inductive step in the case bit = 1 − b, assume that the inductive claim holds
for time m' ≥ 0. We want to show that it holds at time m' + 1. Parts (a)(i) and (b) are
immediate from the inductive hypothesis. The argument for part (a)(ii) is the same as in
the base case. This completes the inductive argument. It follows immediately from the
induction that r_{1−b} is consistent with (BT^{◇B})^{J(k,b)} and that if r is consistent with
(BT^{◇B})^{J(k,b)} and bit = 1 − b in r, then r = r_{1−b}.

Now consider the case where bit = b in r. Define b-runs to be the set {r^{k+1}, r^{k+2}, ..., r^{k+5}},
and b-pts(m') to be {(r^{k+1}, m'), (r^{k+2}, m'), ..., (r^{k+5}, m')}. We show by induction on
m' ≥ 0 that if r is consistent with (BT^{◇B})^{J(k,b)} then

(a) r(m') ∈ b-pts(m'),

(b) (BT^{◇B}_S)^{J(k,b)}(r_S(m')) = skip if m' ≠ k, and sendbit if m' = k,

(c) at least one run in b-runs agrees with r up to time m'; moreover, if m' ≥ k + 5,
then exactly one run in b-runs agrees with r up to time m'.


For the base case, it is again immediate that r(0) ∈ b-pts(0) and that all runs in b-runs
agree with r up to time 0. To see that part (b) holds, first note that min_S^{σ(P^2(k,b))}(r, 0) =
{(r^{k+k'}, 0) : k' = 1, ..., 5}. There are now two cases: if k = 0 (so that S sends a
message in round 1 of all the runs in b-runs), then we must show that (J(k, b), r, 0) ⊨
¬B_S(do(S, skip) > ◇B_R(bit)), so that (BT^{◇B}_S)^{J(k,b)}(r_S(0)) = sendbit. Note that, if k = 0,
then closest([do(S, skip)], (r^{k+k'}, 0), J(k, b)) = {(r*, 0)} for k' = 1, ..., 5, where r* is the run
where bit = b and no messages are ever sent by S or R. Thus, it suffices to show that
(J(k, b), r*, 0) ⊨ ¬◇B_R(bit). It is easy to see that, since σ is deviation compatible,
we must have (r_{1−b}, m) ∈ min_R^{σ(P^2(k,b))}(r*, m), for all m ≥ 0. Thus, (J(k, b), r*, m) ⊨
¬B_R(bit = b) for all m ≥ 0; since bit = b in r*, the disjunct bit = 1 − b ∧ B_R(bit = 1 − b)
fails as well, and hence (J(k, b), r*, m') ⊨ ¬◇B_R(bit) for all m' ≥ 0, as
desired. On the other hand, if k > 0, we must show that (J(k, b), r, 0) ⊨ B_S(do(S, skip) >
◇B_R(bit)). Note that if k > 0, then closest([do(S, skip)], (r^{k+k'}, 0), J(k, b)) = {(r^{k+k'}, 0)}, for
k' = 1, ..., 5. Since (J(k, b), r^{k+k'}, 0) ⊨ do(S, skip) ∧ ◇B_R(bit), we are done.

The argument in the inductive step is almost identical, except that it now breaks into
the cases m' < k, m' = k, k < m' < k + 5, and m' ≥ k + 5. We leave details to the
reader.

Finally, we must show that each run r ∈ b-runs is consistent with (BT^{◇B})^{J(k,b)}. We
proceed by induction on m' to show that r is consistent with (BT^{◇B})^{J(k,b)} up to time m'.
This involves proving part (b) of the induction above for each r ∈ b-runs. The proof is
similar to that above, and left to the reader. ∎

The preceding discussion has shown that P^2(k, b) implements BT^{◇B}, but not BT^>, in
contexts in EC_1. Lemma 3.5 shows that P^1(k, m) implements BT^> in contexts in EC_1.
An obvious question is whether P^1(k, m) implements BT^{◇B} in contexts in EC_1. We now
show that if k ≠ m, then P^1(k, m) does not implement BT^{◇B}; if k = m, then whether
P^1(k, m) implements BT^{◇B} depends on what the receiver believes in runs where he does
not receive a message. Since there is no run of P^1(k, m) where the receiver receives no
messages, this is not determined by just assuming that we have a deviation-compatible
ranking generator. Given a ranking κ, let κ(n, b) be the rank of the run with least rank
where (a) the receiver does not receive any messages up to and including time n and (b)
the bit has value b. We say that a ranking κ is biased if κ(n, 0) ≠ κ(n, 1) holds for at least
one time instant n. Note that if κ(n, i) < κ(n, i ⊕ 1) then, in the absence of messages, R
will believe that the bit is i at time n.
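
The definition of a biased ranking can be checked directly on a finite description of the
runs. The sketch below is illustrative only and rests on several assumptions that are not
in the paper: a run is summarized by a hypothetical triple (rank, bit, first_receipt), an
empty set of candidate runs is given rank infinity, and bias is tested only up to a finite
horizon.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical summary of a run: (rank, bit, first_receipt), where
# first_receipt is the time at which R first receives a message,
# or None if R never receives one.
Run = Tuple[int, int, Optional[int]]


def kappa(runs: List[Run], n: int, b: int) -> float:
    """Least rank of a run with bit b in which R receives no message up to
    and including time n (infinity if there is no such run: an assumption)."""
    ranks = [rank for rank, bit, first in runs
             if bit == b and (first is None or first > n)]
    return min(ranks) if ranks else math.inf


def is_biased(runs: List[Run], horizon: int) -> bool:
    """True if kappa(n, 0) != kappa(n, 1) for some time n <= horizon."""
    return any(kappa(runs, n, 0) != kappa(runs, n, 1)
               for n in range(horizon + 1))


# Rank-0 runs of P^1(0, 0): the message is received at some time n in 1..5.
rank0 = [(0, b, n) for b in (0, 1) for n in range(1, 6)]
# Two ways of ranking the "silent" runs in which S never sends anything.
unbiased = rank0 + [(1, 0, None), (1, 1, None)]
biased = rank0 + [(1, 0, None), (2, 1, None)]
```

In the second ranking, the silent run with bit = 0 is ranked strictly lower than the one
with bit = 1, so from time 5 on a receiver that has heard nothing would believe that the
bit is 0.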

Lemma 3.7 Let ζ = (γ_1, π, o, σ) ∈ EC_1. The protocol P^1(k, m) de facto implements
BT^{◇B} in ζ exactly if both (a) k = m and (b) σ(P^1(k, k)) is not biased.

Proof Fix a context ζ = (γ_1, π, o, σ) ∈ EC_1. As in the proof of Lemma 3.5, define
J(k, m) = (I⁺(γ_1, π), o(P^1(k, m)), σ(P^1(k, m))) and the runs r_{b,n}.

First suppose that σ(P^1(k, k)) is not biased. We show that P^1(k, k) de facto
implements BT^{◇B} in ζ. By definition, in each of the ten runs r_{b,n} of rank 0 in the extended
system J(k, k), recbit holds at the time n when the receiver R receives the bit. Since R
receives the correct bit, it is easy to see that in fact (J(k, k), r_{b,n}, n) ⊨ B_R(bit). Thus,
◇B_R(bit) holds at every point in the ten runs of the form r_{b,n} in the system J(k, k).
Moreover, (J(k, k), r_{b,n}, m) ⊨ do(S, skip) > ◇B_R(bit) for m ≠ k. Since the runs r_{b,n}
are the runs of rank 0, it actually follows that (J(k, k), r_{b,n}, m) ⊨ B_S(do(S, skip) >
◇B_R(bit)) for m ≠ k. We now show that (J(k, k), r_{b,n}, k) ⊨ ¬B_S(do(S, skip) > ◇B_R(bit)).
Note that closest([do(S, skip)], (r_{b,n}, k), J(k, k)) = {(r'_b, k)}, where r'_b is the run where
the bit is b and S sends no messages. Suppose that (J(k, k), r'_b, k) ⊨ ◇B_R(bit = b).
Thus, there is some n ≥ k such that (J(k, k), r'_b, n) ⊨ B_R(bit = b). Then we must
have κ(n, b) < κ(n, b ⊕ 1), so that κ is biased, contradicting the assumption. Thus,
(J(k, k), r'_b, k) ⊨ ¬◇B_R(bit = b), so (J(k, k), r_{b,n}, k) ⊨ ¬B_S(do(S, skip) > ◇B_R(bit)),
as desired. In this case, by (BT^{◇B})^{J(k,k)}, the sender S should perform sendbit at time k.
It follows that r_{b,n} is consistent with (BT^{◇B})^{J(k,k)}.

We next show that if r is consistent with (BT^{◇B})^{J(k,k)}, then r ∈ {r_{b,n} : b = 0, 1, n =
k + 1, ..., k + 5}. So suppose that the bit is 0 in r and that r is consistent with
(BT^{◇B})^{J(k,k)}. Just as in the proof of Lemma 3.6, it is easy to show by induction on
m that no messages are sent in r at time m < k: It is easy to see that (J(k, k), r, m) ⊨
B_S(do(S, skip) > ◇B_R(bit)) for m < k, since (r, m) ~_S (r_{0,n}, m). Just as in the case of
r_{b,n}, we can show that (J(k, k), r, k) ⊨ ¬B_S(do(S, skip) > ◇B_R(bit)). Thus, since r is
consistent with (BT^{◇B})^{J(k,k)}, the sender S sends a message at time k in r. It is easy to
show that S does not send the bit after time k; we leave details to the reader. Thus, if r
is consistent with (BT^{◇B})^{J(k,k)}, then S sends the bit in r at time k (and does not send it
at any other time), so r is of the form r_{b,n}.

We next claim that if k ≠ m then P^1(k, m) does not de facto implement BT^{◇B} in ζ.
Without loss of generality, suppose that k < m. By the properties of γ_1, messages
can take up to five time units to be delivered. Hence, there is a run of P^1(k, m) with
bit = 1 in which the sender's message is not delivered by time m + 4. However, because
k < m, there is no run with bit = 0 where no message is delivered by time m + 4.
Because σ is deviation compatible, it follows that κ(m + 4, 1) = 0 < κ(m + 4, 0). Thus,
(J(k, m), r_{1,m+4}, m + 4) ⊨ B_R(bit = 1), so (J(k, m), r_{1,m+j}, m) ⊨ B_S(do(S, skip) >
◇B_R(bit)) for j = 1, ..., 5. Therefore, S should not send the bit at time m according
to (BT^{◇B})^{J(k,m)} in runs where the bit is 1, showing that P^1(k, m) does not de facto
implement BT^{◇B}.

To complete the proof of the lemma, we need to show that if κ = σ(P^1(k, k)) is
biased, then P^1(k, k) does not implement BT^{◇B} in ζ. So suppose that κ = σ(P^1(k, k))
is biased. Since κ is biased, there is an n for which κ(n, 0) ≠ κ(n, 1). Without loss of
generality, assume that κ(n, 0) < κ(n, 1). We must have n > k, since κ(ℓ, 0) = κ(ℓ, 1) = 0
for all ℓ ≤ k, because in all runs consistent with P^1(k, k), the receiver R receives no
messages up to time ℓ. It follows that (J(k, k), r, k) ⊨ ◇B_R(bit = 0) for all runs r
consistent with P^1(k, k). Thus, (J(k, k), r_{0,k+j}, m) ⊨ B_S(do(S, skip) > ◇B_R(bit)) for
j = 1, ..., 5. It follows that, in runs where the bit is 0, S should not send the bit according
to (BT^{◇B})^{J(k,k)}. This again establishes that P^1(k, k) does not de facto implement BT^{◇B}. ∎


Now consider the context γ_2. Here there is no upper bound on message delivery times.
As a result, S must send R messages regardless of what the bit value is.

Lemma 3.8 Every instance of P^1(k, m) de facto implements both BT^> and BT^{◇B} in
every context in EC_2.

Proof The proof for the case of BT^> is identical to the proof given for contexts in EC_1
in Lemma 3.5. There are now infinitely many runs r_{b,n} consistent with P^1(k, m) rather
than ten runs, but the argument remains sound. We leave details to the reader.

In the case of BT^{◇B}, the argument follows the same lines as the proof of Lemma 3.5,
except that the role of ◇recbit is now played by ◇B_R(bit). Fix k, m, and a context ζ =
(γ_2, π, o, σ) ∈ EC_2. We want to show that P^1(k, m) ≈_{γ_2} (BT^{◇B})^{J'(k,m)}, where J'(k, m) =
(I⁺(γ_2, π), o(P^1(k, m)), σ(P^1(k, m))). It is easy to check that in the extended system
J'(k, m), the formula B_R(bit = b) holds in run r_{b,n} from time n on. Thus, ◇B_R(bit)
holds at every point in every run consistent with P^1(k, m) in the system J'(k, m). Note
that the runs r_{b,n} are precisely those of rank 0 in J'(k, m). Finally, note that if (r', n)
is an arbitrary point in J'(k, m) with n ≥ max(k, m) and no messages are sent in r' up
to time n, then (J'(k, m), r', n) ⊨ ¬B_R(bit = 0) ∧ ¬B_R(bit = 1), since there are runs
consistent with P^1(k, m) where no messages arrive up to time n and the bit can be either
0 or 1; for example, (r_{0,n+1}, n) ~_R (r', n) and (r_{1,n+1}, n) ~_R (r', n).

We now show that a run r is consistent with (BT^{◇B})^{J'(k,m)} in γ_2 iff r = r_{b,n} for
b ∈ {0, 1} and n > 0. So suppose that r is consistent with (BT^{◇B})^{J'(k,m)} and the
value of the bit in r is 0. It suffices to show that S sends exactly one message in r,
and that happens at time k. The argument is very similar to that in Lemma 3.6. If
n < k, then clearly (J'(k, m), r, n) ⊨ do(S, skip) > ◇B_R(bit), since the closest point
to (r, n) where do(S, skip) holds is (r, n) itself. On the other hand, if n = k, then
closest([do(S, skip)], (r, n), J'(k, m)) = {(r'_0, n)}, where r'_0 is the run where S sends
no messages and the initial bit is 0. As observed earlier, we have (J'(k, m), r'_0, n) ⊨
□(¬B_R(bit = 0) ∧ ¬B_R(bit = 1)), so (J'(k, m), r_{b,n}, n) ⊨ ¬(do(S, skip) > ◇B_R(bit)).
Thus, since r is consistent with (BT^{◇B})^{J'(k,m)} in γ_2, S sends its bit at time k in r.
Finally, if n > k, again we have closest([do(S, skip)], (r, n), J'(k, m)) = {(r, n)} so,
again, S does not send a message at time n in r. Thus, r has the form r_{0,n'} for some n'.
The same argument shows that all runs of the form r_{0,n'} are in fact consistent with
(BT^{◇B})^{J'(k,m)}. The argument if b = 1 is identical (with m replacing k throughout). ∎

Finally, we consider the contexts in $EC_3$. In this case, communication is such that
if R sends no messages, then S is guaranteed to have one of its messages delivered only in
case it sends infinitely many messages. This says that if we consider only protocols of the
form $(P_S, SKIP_R)$, then S must send infinitely many messages in every run. However, if a
protocol sends infinitely many messages, then no particular one is necessary; if S does not
send, say, the first message, then it still sends infinitely many, and R is guaranteed to get
a message. This suggests that we will have difficulty finding a protocol that implements
$BT^>$ or $BT^{\Diamond B}$. The following proposition provides further evidence of this. If
$I \subseteq N$ (the natural numbers), let $P(I) = (P_S(I), SKIP_R)$, where $P_S(I)$ is described
by the program

if time $\in I$ then sendbit else skip.

Thus, with $P_S(I)$, the sender S sends the bit at every time that appears in $I$.
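
As a rough illustration (not part of the original development), the sender's half of P(I) can be sketched in Python. The names make_sender and send_times, and the choice to represent I as a membership predicate so that infinite sets can be expressed, are assumptions introduced here. The sketch covers only S's choice of action; it says nothing about message delivery, belief, or closeness.

```python
from typing import Callable, List

def make_sender(in_I: Callable[[int], bool]) -> Callable[[int], str]:
    """Hypothetical encoding of P_S(I): the action depends only on the clock time."""
    def act(time: int) -> str:
        # "if time in I then sendbit else skip"
        return "sendbit" if in_I(time) else "skip"
    return act

def send_times(sender: Callable[[int], str], horizon: int) -> List[int]:
    """Times in 0..horizon-1 at which the sender performs sendbit."""
    return [t for t in range(horizon) if sender(t) == "sendbit"]

# A finite index set and an infinite one (the even numbers), both illustrative.
finite_I = {0, 3, 4}
p_finite = make_sender(lambda t: t in finite_I)
p_evens = make_sender(lambda t: t % 2 == 0)

print(send_times(p_finite, 10))
print(send_times(p_evens, 7))
```

With the finite set {0, 3, 4}, the sender skips at every time from sup(I) + 1 = 5 on, whereas with the even numbers it keeps sending forever.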

Proposition 3.9 No protocol of the form $P(I)$ de facto implements either $BT^>$ or $BT^{\Diamond B}$
in any context in $EC_3$.

Proof We sketch the argument here and leave details to the reader. First suppose
that $I$ is finite. Let $r$ be a run in $P(I)$ where none of the finitely many messages sent
by S is received. Let $n = \sup(I) + 1$. Suppose that $(\gamma_3, \pi, o, \sigma) \in EC_3$. Let
$\mathcal{J}(I) = (\mathcal{I}^+(\gamma_3, \pi), o(P(I)), \sigma(P(I)))$. Clearly,
$closest([do(S, skip)], (r, n), \mathcal{J}(I)) = \{(r, n)\}$, since
S performs the act skip at $(r, n)$. However, since R never receives the bit in run $r$, and
$\sigma(P(I))(r) = 0$, it follows that $(\mathcal{J}(I), r, n) \models \neg\Diamond recbit$ and
$(\mathcal{J}(I), r, n) \models \neg\Diamond B_R(bit)$.
Thus, according to both $BT^>$ and $BT^{\Diamond B}$, S should send a message at $(r, n)$. It follows
that $P(I)$ does not implement $BT^>$ or $BT^{\Diamond B}$.

Now suppose that $I$ is infinite. The properties of $\gamma_3$ ensure that R does in fact receive
the bit in every run of $P(I)$. Moreover, it is easy to check that when the message is
received, both $recbit$ and $B_R(bit)$ hold. Hence, for any given clock time $m \in I$, the
formulas $do(S, skip) > \Diamond recbit$ and $do(S, skip) > \Diamond B_R(bit)$ hold at time $m$ in all
runs of the protocol. A straightforward argument shows that sendbit is neither compatible with
$BT^>$ nor with $BT^{\Diamond B}$ at time $m$. ∎

Intuitively,  Proposition  3.9  is  a  form  of  the  “procrastinator’s  paradox”:  Any  action 
that  must  be  performed  only  eventually  (e.g.,  washing  the  dishes)  can  always  safely  be 
postponed  for  one  more  day.  Of  course,  using  this  argument  inductively  results  in  the 
action  never  being  performed. 

Despite Proposition 3.9, we now show that $BT^>$ and $BT^{\Diamond B}$ are both implementable
in all contexts in $EC_3$. Let $P^\omega = (P^\omega_S, SKIP_R)$, where $P^\omega_S$ is the protocol
determined by the following program:

if  time  =  0  or  sendbit  was  performed  in  the  previous  round,  then  sendbit  else  skip. 

Since S's local state contains both the current time and a record of the time at which
it sent every previous message, it can perform the test in $P^\omega_S$. It is not too hard to
see that $P^\omega$ is de facto equivalent to $P(N)$ in $\gamma_3$: under normal circumstances the bit
is sent in each and every round. The two protocols differ only in their counterfactual
behavior. As a result, while $P(N)$ implements neither $BT^>$ nor $BT^{\Diamond B}$, the protocol
$P^\omega$ implements both.
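
The difference in counterfactual behavior can be made concrete with a small simulation. The following Python sketch is only an illustration under simplifying assumptions: it tracks the sender's actions alone, and it models the counterfactual $do(S, skip)$ by forcing a single skip at a chosen round and letting the protocol resume afterwards. The names p_all, p_omega, and run are hypothetical.

```python
from typing import Callable, List, Optional

SEND, SKIP = "sendbit", "skip"

def p_all(time: int, history: List[str]) -> str:
    """Sender of P(N): send in every round, whatever happened before."""
    return SEND

def p_omega(time: int, history: List[str]) -> str:
    """Sender of P^omega: send at time 0, or if sendbit was performed in the previous round."""
    if time == 0 or history[-1] == SEND:
        return SEND
    return SKIP

def run(protocol: Callable[[int, List[str]], str],
        horizon: int,
        deviate_at: Optional[int] = None) -> List[str]:
    """Actions of S in rounds 0..horizon-1.

    If deviate_at is given, S is forced to skip in that round (the
    counterfactual deviation) and follows the protocol in all other rounds.
    """
    history: List[str] = []
    for t in range(horizon):
        action = SKIP if t == deviate_at else protocol(t, history)
        history.append(action)
    return history

# Without deviations the two protocols produce the same actions.
print(run(p_all, 5) == run(p_omega, 5))
# After a forced skip in round 2 they come apart.
print(run(p_all, 5, deviate_at=2))
print(run(p_omega, 5, deviate_at=2))
```

In this sketch, after the forced skip the sender of P(N) resumes sending, so skipping once costs nothing; the sender of $P^\omega$ never sends again, so only finitely many messages are sent after a deviation.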

Lemma 3.10 $P^\omega$ de facto implements both $BT^>$ and $BT^{\Diamond B}$ in every context in $EC_3$.

Proof We provide the proof for $BT^{\Diamond B}$. The proof for $BT^>$ is similar, and left to the
reader.

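Fix a context $\zeta = (\gamma_3, \pi, o, \sigma) \in EC_3$. We want to show that
$P^\omega \approx_{\gamma_3} (BT^{\Diamond B})^{\mathcal{J}^\omega}$,
where $\mathcal{J}^\omega = (\mathcal{I}^+(\gamma_3, \pi), o(P^\omega), \sigma(P^\omega))$. Let
$R^\omega = R(P^\omega, \gamma_3)$. Note that, for every natural
number $k$, there are runs $r_{b,k} \in R^\omega$ in which $bit = b$ and no message that is sent by S in
the first $k$ rounds is ever delivered to R. It follows that if R has received no message by
time $m$ in run $r$ of $P^\omega$, then $(\mathcal{J}^\omega, r, m) \models \neg B_R(bit)$.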

We now prove by induction on $k$ that a run $r$ is consistent with
$(BT^{\Diamond B})^{\mathcal{J}^\omega}$ in $\gamma_3$ for $k$
rounds exactly if S has performed sendbit in each of the first $k$ rounds of $r$. The base case
for $k = 0$ is vacuously true. For the inductive step, assume that the claim is true for $k = \ell$.
Suppose that $r$ is consistent with $(BT^{\Diamond B})^{\mathcal{J}^\omega}$ for $\ell + 1$ rounds. By the
induction hypothesis, the sender S has performed sendbit in each of the first $\ell$ rounds. Since
$r$ is, by assumption, consistent with $(BT^{\Diamond B})^{\mathcal{J}^\omega}$ for $\ell + 1$ rounds, S
performs sendbit in round $\ell + 1$ of $r$ exactly if
$(\mathcal{J}^\omega, r, \ell) \models \neg B_S(do(S, skip) > \Diamond B_R(bit))$. Let $bit = b$ in $r$.
Moreover, $\sigma(P^\omega)(r) = 0$ since $\sigma$ is deviation compatible. Clearly
$(r, \ell) \sim_S (r_{b,\ell}, \ell)$, where $r_{b,\ell} \in R^\omega$ is the run
constructed earlier where none of the messages sent by S in the first $\ell$ rounds arrive, since
in both $r$ and $r_{b,\ell}$, the bit is the same and S sends a message in each of the first $\ell$
rounds. Moreover, $\sigma(P^\omega)(r_{b,\ell}) = 0$, since $\sigma$ is deviation compatible and
$r_{b,\ell} \in R^\omega$. Thus, to show
that $(\mathcal{J}^\omega, r, \ell) \models \neg B_S(do(S, skip) > \Diamond B_R(bit))$, it suffices to show
that $(\mathcal{J}^\omega, r_{b,\ell}, \ell) \models \neg(do(S, skip) > \Diamond B_R(bit))$. The points in
$closest([do(S, skip)], (r_{b,\ell}, \ell), \mathcal{J}^\omega)$ have the form
$(r', \ell)$ where $r'$ agrees with $r_{b,\ell}$ up to and including time $\ell$, S does nothing in
round $\ell$ of $r'$, and S follows $P^\omega$ in all rounds after $\ell$ in $r'$. The key point here is
that, by following $P^\omega$, S sends no messages in $r'$ after round $\ell$. Consequently, in all runs
appearing in this set of closest points, S sends a finite number of messages (exactly $\ell$, in
fact). By the admissibility condition $\Psi_3$ of $\gamma_3$, there is one run in this set, which we
denote by $\hat{r}$, in which R receives no messages. Note that
$(\hat{r}, n) \sim_R (r_{0,n}, n)$ and $(\hat{r}, n) \sim_R (r_{1,n}, n)$, since in all of $\hat{r}$,
$r_{0,n}$ and $r_{1,n}$, the receiver R receives no messages up to time $n$. Since both $r_{0,n}$ and
$r_{1,n}$ are in $R^\omega$, it follows that they both have rank 0. Thus,
$(\mathcal{J}^\omega, \hat{r}, n) \models \neg B_R(bit)$. That is, $B_R(bit)$
never holds in $\hat{r}$. It follows that
$(\mathcal{J}^\omega, r_{b,\ell}, \ell) \models \neg(do(S, skip) > \Diamond B_R(bit))$, as needed. We
can thus conclude that $r$ is consistent with $(BT^{\Diamond B})^{\mathcal{J}^\omega}$ in $\gamma_3$ for
$\ell + 1$ rounds exactly if S performs sendbit in the first $\ell + 1$ rounds, and we are done. ∎

Lemma 3.10 shows one way of resolving the procrastinator's paradox: If one decides
that an action (e.g., washing the dishes) that is not performed now will never be
performed, then performing it becomes critical. (We are ignoring the issue of how one can
“decide” to use such a protocol. In the context of distributed computing, we can just make
this the protocol; people are likely not to believe that this is truly the protocol.) In any
case, using such a protocol makes performing the action consistent with the
procrastinator’s protocol of doing no more than what is absolutely necessary.

We can summarize our analysis of implementability of $BT^>$ and $BT^{\Diamond B}$ by the following
theorem:

Theorem 3.11 Both $BT^>$ and $BT^{\Diamond B}$ are de facto implementable in every extended
context in $EC_1 \cup EC_2 \cup EC_3$. Moreover, if $P$ de facto implements $BT^>$ or
$BT^{\Diamond B}$ in a context $\zeta \in EC_1 \cup EC_2$, then S sends at most one message in every run
consistent with $P$ in $\zeta$.

Proof The implementability claims follow from Lemmas 3.5, 3.6, and 3.10. We now
prove that S sends no more than one message in every run of a protocol that de facto
implements $BT^>$ or $BT^{\Diamond B}$ in a context in $EC_1 \cup EC_2$. Suppose that
$P = (P_S, P_R)$ de facto implements $BT^>$ in
$\zeta = (\gamma, \pi, o, \sigma) \in EC_1 \cup EC_2$. Further suppose, by way of
contradiction, that there is a run $r$ consistent with $P$ in $\gamma$ in which the sender sends
more than one message. Suppose that the second message is sent at time $k$, and the
value of the bit in $r$ is $b$. Let $\mathcal{J} = (\mathcal{I}^+(\gamma, \pi), o(P), \sigma(P))$. Since
$\gamma \in \{\gamma_1, \gamma_2\}$, all
messages are guaranteed to arrive eventually in the context $\gamma$. Thus, it is easy to see
that $(\mathcal{J}, r, k) \models B_S(\Diamond B_R(bit = b))$. It follows that
$(\mathcal{J}, r, k) \models do(S, skip) > \Diamond B_R(bit)$.
Since $P$ is de facto consistent with $BT^>$, this means that S should not send a message at
$(r, k)$. This is a contradiction. ∎

All the contexts we have considered are synchronous; the sender and receiver know
the time. As we observed earlier, there is no analogue of $\gamma_1$ in the asynchronous setting,
since it does not make sense to say that messages arrive in 5 rounds. However, there are
obvious analogues of $\gamma_2$ and $\gamma_3$. Moreover, if we assume that S's local state keeps track
of how many times it has been scheduled and what it did when it was scheduled, then
the analogue of P2(k, m) implements both $BT^>$ and $BT^{\Diamond B}$ if messages are guaranteed to
arrive (where now P2(k, m) means that if bit = 0, then the kth time that S is scheduled
it performs sendbit, while if bit = 1, then the mth time that S is scheduled it performs
sendbit). Similarly, the analogue of $P^\omega$ implements both $BT^>$ and $BT^{\Diamond B}$ in contexts
that satisfy the fairness assumption (but any finite number of messages may not arrive).
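
The asynchronous analogue of P2(k, m) described above can be sketched as follows. This is a minimal illustration, assuming that S's local state is just a counter of how often it has been scheduled plus a log of what it did each time; the class name AsyncSender and its fields are hypothetical, and the scheduler and message delivery are not modelled.

```python
from typing import List

class AsyncSender:
    """Hypothetical sender for the asynchronous analogue of P2(k, m)."""

    def __init__(self, bit: int, k: int, m: int) -> None:
        self.bit = bit
        self.k = k
        self.m = m
        self.times_scheduled = 0      # how many times S has been scheduled
        self.log: List[str] = []      # what S did each time it was scheduled

    def step(self) -> str:
        """Called each time S is scheduled; returns the action S performs."""
        self.times_scheduled += 1
        # Send on the kth scheduling if bit = 0, on the mth if bit = 1.
        target = self.k if self.bit == 0 else self.m
        action = "sendbit" if self.times_scheduled == target else "skip"
        self.log.append(action)
        return action

s0 = AsyncSender(bit=0, k=2, m=4)
s1 = AsyncSender(bit=1, k=2, m=4)
trace0 = [s0.step() for _ in range(5)]
trace1 = [s1.step() for _ in range(5)]
print(trace0)
print(trace1)
```

Because the test refers to the number of times S has been scheduled rather than to a global clock, it can be evaluated from S's local state alone.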

4 Discussion


This  paper  presents  a  framework  that  facilitates  high-level  counterfactual  reasoning  about 
protocols.  Indeed,  it  enables  the  design  of  well-defined  protocols  in  which  processes  act 
based  on  their  knowledge  of  counterfactual  statements.  This  is  of  interest  because,  in 
many  instances,  the  intuition  behind  the  choice  of  a  given  course  of  action  is  best  thought 
of  and  described  in  terms  of  counterfactual  reasoning.  For  example,  it  is  sometimes  most 
efficient for agents to stop expending resources once they know that their goals will be
achieved even if they stop. Making this precise involves counterfactual reasoning; the
agent must consider what would happen were it to stop expending resources.

This  paper  should  perhaps  best  be  viewed  as  a  “proof  of  concept”;  the  examples 
involving  the  bit-transmission  program  show  that  counterfactuals  can  play  a  useful  role  in 
knowledge-based  programs.  While  we  have  used  standard  approaches  to  giving  semantics 
to belief and counterfactuals (adapted to the runs and systems framework that we are
using),  these  definitions  give  the  user  a  large  number  of  degrees  of  freedom,  in  terms 
of  choosing  the  ranking  function  to  define  belief  and  the  notion  of  closeness  needed 
to  define  counterfactuals.  While  we  have  tried  to  suggest  some  reasonable  choices  for 
how  the  ranking  function  and  the  notion  of  closeness  are  defined,  and  these  choices 
certainly gave answers that matched our intuitions in all the contexts we considered for
the bit-transmission problem, it would be helpful to have a few more examples to test the
reasonableness of the choices. We are currently exploring the application of cbb programs
for  analyzing  message-efficient  leader  election  in  various  topologies;  we  hope  to  report  on 
this  in  future  work. 

While we used the very simple problem of bit transmission as a vehicle for introducing
our framework for knowledge, belief, and counterfactuals, we believe it should be
useful  for  handling  a  much  broader  class  of  distributed  protocols.  We  gave  an  example 
of  how  counterfactual  reasoning  is  useful  in  deciding  whether  a  message  needs  to  be  sent. 
Similar  issues  arise,  for  example,  in  deciding  whether  to  perform  a  write  action  on  a 
shared-memory variable. Because our framework provides a concrete model for understanding
the interaction between belief and counterfactuals, and for defining the notion
of “closeness” needed for interpreting counterfactuals, it should also be useful for
illuminating some problems in philosophy and game theory. The insight our analysis gave
to  the  procrastinator’s  paradox  is  an  example  of  how  counterfactual  programs  can  be 
related  to  issues  in  the  philosophy  of  human  behavior.  We  believe  that,  in  particular,  the 
framework  will  be  helpful  in  understanding  some  extensions  of  Nash  equilibrium  in  game 
theory.  For  example,  as  we  saw  in  Lemma  3.7,  whether  a  protocol  de  facto  implements 
a cbb program depends on the agent’s beliefs. This seems closely related to the notion
of a subjective equilibrium in game theory [Kalai and Lehrer 1995]. We are currently
working on drawing a formal connection between our framework and notions of equilibrium
in  game  theory.  It  would  also  be  interesting  to  relate  the  notion  of  “closeness”  defined  in 
our  framework  to  that  given  by  the  structural-equations  model  used  by  Pearl  [2000]  (see 
also  [Halpern  2000]).  The  structural-equations  model  also  gives  a  concrete  interpretation 
to “closeness”; it does so in terms of mechanisms defined by equations. It would be
interesting  to  see  if  these  mechanisms  can  be  modeled  as  protocols  in  a  way  that  makes 
the  definitions  agree. 